assemblyShenanigans

Category: Assembly language
Development tool: Assembly
File size: 0KB
Downloads: 0
Upload date: 2022-08-08 17:44:59
Uploader: sh-1993
Description: My attempt to teach others about microprocessors and programming in IA-32 and IA-64 assembly, and to spread the word of how awesome it is.

File list:
ia-32/ (0, 2022-08-08)
ia-32/1. helloWorld.asm (2306, 2022-08-08)
ia-64/ (0, 2022-08-08)
ia-64/1.HelloWorld.asm (231, 2022-08-08)
ia-64/2.Endianness.asm (383, 2022-08-08)
ia-64/3.MovExceptions.asm (203, 2022-08-08)
ia-64/3.MovingData.asm (570, 2022-08-08)
ia-64/3.lea.asm (250, 2022-08-08)
ia-64/4.Stack.asm (307, 2022-08-08)
ia-64/5.Arithmetic.asm (557, 2022-08-08)
ia-64/5.CarryArithmetic.asm (262, 2022-08-08)
ia-64/6.Logical.asm (188, 2022-08-08)
ia-64/7.loops.asm (430, 2022-08-08)
ia-64/8.procedure.asm (329, 2022-08-08)
ia-64/9.stringManipulation.asm (666, 2022-08-08)
images/ (0, 2022-08-08)
images/stack-frame.png (107756, 2022-08-08)
images/unconditional-jumps.png (143487, 2022-08-08)

# Assembly Shenanigans

My attempt to teach others about microprocessors and programming in IA-32 and IA-64 assembly, and to spread the word of how awesome it is.

# Table of contents

- [Pre-requisites](#pre-requisites)
    - [Architecture Model](#architecture-model)
    - [CPU](#cpu)
    - [Registers](#registers)
    - [Bus](#bus)
    - [Clock Speed](#clock-speed)
    - [Fetch-decode-execute cycle](#fetch-decode-execute-cycle)
    - [Memory addressing](#memory-addressing)
    - [Instructions](#instructions)
    - [Instruction Set Architecture](#instruction-set-architecture)
        - [Basics](#basics)
        - [Approaches to ISA on the basis of architectural complexity](#approaches-to-isa-on-the-basis-of-architectural-complexity)
        - [Microarchitecture](#microarchitecture)
        - [What does 32 and 64 bit actually mean?](#what-does-32-bit-and-64-bit-etc-actually-mean)
    - [Micro-processor, micro-controller, and micro-computer](#micro-processor%2C-micro-controller%2C-and-micro-computer)
    - [Difference b/w CPU, Processor and Core](#difference-b%2Fw-cpu%2C-processor-and-core)
- [x86_64 Assembly](#x86_64-assembly)
    - [Getting started](#getting-started)
        - [Installation](#installing-the-required-tools)
        - [Understanding process memory maps](#understanding-how-does-a-program-looks-like-in-the-memory)
        - [Looking at memory maps](#looking-at-the-process-memory-map)
        - [The boilerplate code](#the-boilerplate-code)
        - [Compiling and running our code](#compiling-and-running-the-code)
    - [Basics](#basics-of-assembly)
        - [Fundamental Data Types](#fundamental-data-types)
        - [Declaring initialized data](#declaring-initialized-data)
        - [Declaring uninitialized data](#declaring-un-initialized-data)
    - [The instruction set](#the-instruction-set)
        - [Moving data around](#moving-data-around)
        - [Arithmetic operations](#arithmetic-operations)
        - [Logical operations](#logical-operations)
    - [More advanced concepts](#more-advanced-concepts)
        - [Loops](#loops)
        - [Jumps](#jumps)
        - [Procedures](#procedures)
            - [Basics](#understanding-procedures)
            - [Anatomy of a CALL instruction](#anatomy-of-a-call-instruction)
            - [Anatomy of a RET instruction](#anatomy-of-a-ret-instruction)
            - [Stack Frames](#stack-frames)
- [Resources](#resources)

# Pre-requisites

## Architecture model

The core elements of today's computing devices are consistent with those designed in the early days of computing, so it's worth studying them first before moving on to their more complex counterparts.

Architecture model | Description
-|-
Von Neumann | In this model, instructions and data are stored in the same memory (you'll come to understand more about this later; the distinction is important in the case of [shellcoding](https://en.wikipedia.org/wiki/Shellcode)).
Harvard Architecture | In this model, instructions and data are stored in separate memories.

## CPU

A CPU has many internal components, which we will discuss one by one: the Control Unit, Arithmetic Logic Unit (ALU), Registers, Cache, and Buses.

Name | Description
-|-
Control Unit | <ul><li>Acts as a supervisor for the different components of the CPU.</li><li>Controls the fetch-decode-execute cycle.</li></ul>
Arithmetic Logic Unit (ALU) | <ul><li>Consists of the **Arithmetic Unit** and the **Logic Unit**.</li><li>The Arithmetic Unit is responsible for performing mathematical operations (addition, subtraction and the like).</li><li>The Logic Unit is responsible for the logical operations (XOR, AND, OR etc).</li></ul>
Registers | <ul><li>The smallest data-holding elements, built into the CPU and directly accessible without any performance penalty.</li><li>They store instructions and values inside the CPU, so that instructions can be executed without referring back to main memory, which is an expensive operation.</li><li>Their storage capacity is limited and depends on the architecture; for example, registers are 64 bits in size in the case of `amd64`. They're also limited in number.</li></ul>
CPU Clock | From a low-level perspective, the CPU is just a collection of sequential and combinational logic, and it needs a clock to synchronize its internal circuitry. The clock sends electric pulses at regular intervals, and the pulse rate dictates how fast the CPU can execute its internal logic.
Cache | <ul><li>Caches exist because, without them, a microprocessor would have to sit idle for many cycles until the required data arrived in the registers from main memory.</li><li>They're built into the processor, and are used to proactively store data pulled from memory to enable fast access.</li><li>**Cache Coherency**: A concept related to multi-threading/multi-processing environments, where more than one entity might be looking at the same information. When that information is updated, it must be updated everywhere it is stored (in each cache and in RAM), otherwise stale data will cause problems.</li></ul>
## Registers

The set of registers varies from architecture to architecture, but they can be categorized into:

Name | Description
-|-
Accumulator | The most frequently used register, sometimes built into the ALU, used to store intermediate data while logical/arithmetic calculations are being done.
Instruction register | Holds the instruction which is just about to be executed by the processor.
Program Counter (Instruction Pointer) | Used to keep track of the execution; points to the next instruction to be executed after the current one.
Counters | Used in loops.
Stack/ Base Pointer | Used to point to the top of the stack and the base of the current stack frame respectively; extremely important for understanding the concept of _Stack Frames_.
FLAGS | A register in which each bit is independent of the others, and which stores information about the current status at any given stage of program execution.
Additional registers | These depend on the architecture, and are extensions to the basic set of registers, such as x87, MMX, SSE etc.

In the case of x86_64, general purpose registers (GPRs) are 64 bits in size.

- The lower 32 bits of `RAX`, `RBX`, `RCX`, `RDX` can be accessed via `EAX`, `EBX`, `ECX`, and `EDX`, and their lower 16 bits via `AX`, `BX`, `CX` and `DX`. The lower half of those 16 bits is accessed via `AL`, `BL`, `CL`, `DL` and the upper half via `AH`, `BH`, `CH` and `DH`.
- The GPRs `RSI`, `RDI`, `RBP` and `RSP` are 64 bits in size. Their lower 32 bits can be accessed via `ESI`, `EDI`, `EBP`, `ESP`, their lower 16 bits via `SI`, `DI`, `BP`, `SP`, and their lower 8 bits via `SIL`, `DIL`, `BPL` and `SPL`.
- There are 8 more GPRs, named `R8` to `R15`. The whole 64 bits can be accessed via `R8`, the lower 32 bits via `R8D` (R8-double word), the lower 16 bits via `R8W` (R8-word), and the lower 8 bits via `R8B` (R8-byte).
- Due to various decisions made during the design of x86_64, writing to `EAX` zeroes out the upper 32 bits of the `RAX` register (the same applies to the 32-bit halves of all other GPRs). Reading `EAX` does not modify `RAX`.

```
General purpose registers: 64 bit
RSI, RDI, RSP, RBP
RAX, RBX, RCX, RDX
R8, R9, R10, R11, R12, R13, R14, R15

┌──────────────────RAX──────────────────┐     ┌─────────────RSI/R8────────────────────┐
┌───────────────────┬─────────┬────┬────┐     ┌───────────────────┬─────────┬────┬────┐
│                   │         │    │    │     │                   │         │    │SIL │
│                   │         │AH  │AL  │     │                   │         │    │R8B │
└───────────────────┴─────────┴────┴────┘     └───────────────────┴─────────┴────┴────┘
                              └───AX────┘                                   └─SI/R8W──┘
                    └───────EAX─────────┘                         └────ESI/R8D────────┘
```

## Bus

- A bus is a group of wires with a common function, used to interconnect components within the CPU and between the CPU and other parts of the system.
- Some higher-end systems use switches instead of a bus-based architecture, but that's outside the scope of this post.

Name | Description
-|-
Control Bus | Bi-directional in nature (CPU <---> other parts), and used to control the data flow. Control signals are transferred through this bus, and they synchronize everything connected to the data bus.
Address Bus | <ul><li>Unidirectional in nature (CPU ---> other parts), and used to transfer addresses from the microprocessor to the other components. Memory addresses are transferred through these lines.</li><li>It defines the amount of memory the microprocessor can address: if there are 16 address lines (as in the Intel 8085), $$2^{16}$$ memory addresses can be addressed by the microprocessor, which is $$2^{16}$$ bytes of memory if we consider it to be byte-addressable.</li></ul>
Data Bus | <ul><li>Used for data transmission b/w the microprocessor and other peripherals, and within the microprocessor as well.</li><li>Bidirectional in nature, and used to define the native word size.</li></ul>
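As a worked example of the address-bus arithmetic above, assuming byte-addressable memory: with $n$ address lines the processor can form $2^n$ distinct addresses. The 16-line case is the Intel 8085 example from the text; the 32-line case is an extra illustration that is not taken from the original.

$$
\begin{aligned}
n = 16 &: \quad 2^{16} = 65{,}536 \text{ addresses} \;\Rightarrow\; 65{,}536 \text{ bytes} = 64 \text{ KiB} \\
n = 32 &: \quad 2^{32} = 4{,}294{,}967{,}296 \text{ addresses} \;\Rightarrow\; 4 \text{ GiB}
\end{aligned}
$$

Each extra address line doubles the addressable range, since $2^{n+1} = 2 \cdot 2^{n}$.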
## Clock Speed

The CPU works on the basis of the Fetch-Decode-Execute cycle. The clock rate of a CPU is the number of clock pulses it receives per second, and therefore bounds how many times this cycle can occur per second. It’s often used as a rough indication of a processor's speed.

## Fetch Decode Execute Cycle

- Most modern CPUs support _stored program execution_, which means the instructions to be executed first exist in memory, and are later fetched into the registers, decoded and executed. This process is known as **Fetch Decode Execute**.
- The Control Unit drives the fetch, decode, execute and store functions of the processor.

```
initialise the program counter
repeat forever
    fetch instruction
    increment the program counter
    decode the instruction
    execute the instruction
end repeat

              ┌────────────┐
       ┌──────┤Control Unit├────┐
       │      └────────────┘    │Execute
Decode │                        │
    ┌──┴──────┐               ┌─▼─┐
    │Registers├───────────────┤ALU│
    └─────────┘     Fetch     └───┘
```

Step | Description
-|-
Fetch | <ul><li>The CPU fetches the instruction from physical memory using its memory address (held in the Program Counter/ Instruction Pointer), and stores it in the Instruction Register.</li><li>Before the instruction is fetched, the Control Unit generates and sends out a control signal (Memory Read) to the primary memory to let it know it’s about to be accessed; the instruction is then fetched through the data lines.</li></ul>
Decode | <ul><li>The CPU interprets the binary instruction to determine what task it’s supposed to perform, and transfers the data needed into the registers to prepare for executing that instruction.</li><li>Instructions are formatted in a particular way to enable efficient decoding; the format specifies the opcode (the operation to be performed), the operands (what to perform the operation on), and the addressing mode.</li><li>Decoder circuitry is used here (such as an 8-to-256 line decoder).</li></ul>
Execute | <ul><li>At this stage, the binary instruction has been decoded and one of the decoder's output lines is activated, which triggers the circuitry that performs the task at hand, whatever it may be.</li><li>After the execution of an instruction is done, the instruction pointer (Program Counter) points to the location where the next instruction is stored, and the cycle repeats.</li></ul>
## Memory Addressing

```
The number and order of operands depends on the instruction addressing mode as follows:

Addressing Modes

Register Direct: Both the operands are registers
    ADD EAX, EAX

Register Indirect: An operand is a register that holds the address in memory where the value is stored
    MOV ECX, [EBX]

Immediate: The operand is included immediately after the instruction in memory
    ADD EAX, 10

Indexed: The address is calculated using a base address plus an index, which can be another register
    MOV AL, [ESI+0x4010000]
    MOV EAX, [EBX+EDI]
```

## Instructions

Name | Description
-|-
Mnemonics | <ul><li>Human-readable names mapped to binary machine codes, which make code faster to write and debug.</li><li>We need an assembler to convert assembly (mnemonic) code to the native format.</li><li>The mappings are defined by the ISA. For example, in the 8085 architecture the A register maps to 111 and ADD maps to 10000, so when we assemble `ADD A` it gets translated to (10000)(111).</li></ul>
Machine code | It can be understood by the microprocessor directly, w/o the need for any middleman.

## Instruction Set Architecture

### Basics

- Instructions are defined as per a specification known as the Instruction Set Architecture (ISA). It specifies things such as the type and size of operands, register states, the memory model, how interrupts and exceptions are handled etc, viz. it's the syntax and semantics.
- Some examples are: x86, x86_64, ARM, MIPS, Power PC, RISC-V etc

### Approaches to ISA on the basis of architectural complexity

Name | Description
-|-
Complex Instruction Sets | <ul><li>A single instruction can do more work (it is capable of multi-step operations), and takes as many clock cycles as it needs to execute.</li><li>Many instructions are supported.</li><li>A computer built with this set is known as a Complex Instruction Set Computer (CISC).</li><li>Example: Motorola 6800, Intel 8051 family.</li></ul>
Reduced Instruction Sets | <ul><li>An optimised set of instructions that the CPU can execute quickly.</li><li>Supports fewer instructions.</li><li>A computer built with this instruction set is known as a Reduced Instruction Set Computer (RISC).</li><li>Example: RISC-V, PowerPC</li></ul>

Some other approaches are: Minimal Instruction Set Computer (MISC), One Instruction Set Computer (OISC), Very Long Instruction Word (VLIW) and Long Instruction Word (LIW), but these are not so common these days.

### Microarchitecture

The micro-architecture is how the instruction set is implemented. Multiple micro-architectures can support the same ISA; for example, both Intel and AMD support the x86 ISA, but they have different implementations (micro-architectures).

### What does 32-bit and 64-bit etc actually mean

- These terms define the native word size of the ISA, which is what the CPU processes at once, viz. if the word size is 1 byte, 1 byte of data can be processed in a single fetch-decode-execute cycle.
- If there are 8 data lines as per the ISA, 8 bits can be transferred simultaneously, viz. each distinct register can store 8 bits, thus the CPU is 8-bit in nature. The address bus plays no part in this classification.
- The native word size typically also defines the addressable memory, because special purpose registers (program counter, instruction register) are used as pointers to memory locations, and the native word size defines the sizes of these registers.
- A 32/64-bit program is not the same thing as a 32/64-bit CPU. A 32-bit program means the CPU will operate in 32-bit mode, and only $2^{32}$ addresses will be accessible.

## Micro-processor, micro-controller, and micro-computer

Name | Description
-|-
Micro-processor | An electronic chip functioning as the CPU of a computer.
Micro-controller | The combination of a micro-processor, I/O ports, and memory on one chip.
Micro-computer | A computer having a microprocessor and limited resources is known as a micro-computer; it is the combination of a micro-controller, I/O devices and memory.
## Difference b/w CPU, Processor and Core

```
CPU            = The hardware that executes instructions, can have multiple cores in it
Processor      = A physical chip containing one or more CPUs
Core           = The basic computational unit of a CPU
Multicore      = Having multiple cores on the same CPU
Multiprocessor = Having multiple processors
```

# x86_64 assembly

## Getting started

### Installing the required tools

- Installing the required tools
```sh
sudo apt install build-essential clang nasm gdb gdbserver
```
- A text editor, I personally use [neovim](https://neovim.io/)
- A guest OS (x86_64)

### Understanding how does a program looks like in the memory

Read more about this here: [link](https://s4dr0t1.github.io/docs/programming/pl-basics)

```
┌───────┐
│ Stack │  Grows downwards
│   │   │  Contains things that are local
│   │   │  to a function (local variables,
│   ▼   │  return addresses, parameters etc)
├───────┤
│ Heap  │  Dynamic memory allocation takes place
├───────┤
│ Data  │  Initialized global/static variables
├───────┤
│ BSS   │  Contains uninitialized data
├───────┤
│ Text  │  Contains our program code
└───────┘
```

### Looking at the process memory map

```sh
# Using gdb (the lines starting with (gdb) are typed at the gdb prompt)
$ gdb -q ./binary
(gdb) break _start
(gdb) run
(gdb) info proc mappings

# Using pmap (takes the PID of the running process)
$ pmap <PID>
```

### The boilerplate code

```nasm
;;The start symbol: when execution begins, control jumps to the address of the label _start
global _start

section .text
    ;;The executable code goes here
_start:

section .data
    ;;Initialized data goes here

section .bss
    ;;Uninitialized data goes here
```

### Compiling and running the code

- Read more about assembly, linking and such stuff [here](https://s4dr0t1.github.io/docs/programming/pl-basics)
- Read more about position independent code [here](https://en.wikipedia.org/wiki/Position-independent_code)

```sh
# Assemble the code
$ nasm ./code.asm -f elf64 -o output.o

# Linking
$ ld output.o -o finalExecutable    #Use the -pie flag to get position independent code

$ ./finalExecutable
```

## Basics of assembly

### Fundamental data types

Name | Size | NASM pseudo-instruction
-|-|-
Byte | 8 bits | `db`
Word | 16 bits | `dw`
Double Word | 32 bits (16 * 2) | `dd`
Quad Word | 64 bits (16 * 4) | `dq`
Double Quad Word | 128 bits (16 * 8) | `ddq`

### Declaring initialized data

```nasm
;;Defining the byte 0x23
db 0x23

;;Defining three successive bytes in memory: 0x12, 0x34, 0x56
db 0x12, 0x34, 0x56

;;Defining a character constant and a byte
db 'x', 0x00

;;Defining a string constant and a byte in succession
db 'hi', 0x10

;;Defining a word (2 bytes, 16 bits)
dw 0x1234       ; 0x34 0x12 (little-endian)
dw 'a'          ; 0x61 0x00
dw 'ab'         ; 0x61 0x62

;;Defining a double word (32 bits, 4 bytes)
dd 0x12345678   ; 0x78 0x56 0x34 0x12 (little-endian)

;;Defining a Quad Word (64 bits, 8 bytes)
dq 0x123456789abcdef0
```

### Declaring un-initialized data

Uninitialized data is stored in the `.bss` section. Since it has no initial value, no space needs to be allocated for it in the object file; only its size is recorded there, and the memory itself is set up when the program is loaded.

```nasm
;;Reserve a byte
section .bss
    byteLabel: resb 1   ;;the label will point to the first byte

;;Reserve a word
section .bss
    wordLabel: resw 1   ;;the label will point to the first byte
```

## The instruction set

### Moving data around

If we’re moving 64-bit data into a 64-bit register, the data will occupy the whole register. When the data is 32 bits wide, it occupies the lower 32 bits and the upper 32 bits are zeroed out. When dealing with 8 or 16-bit operands, the other bits are not modified.

#### `MOV` instruction

```nasm
;;B/w registers
mov registerA, registerB

;;Memory to r ... ...
```
