assembly-tutorial

Category: Other
Development tool: Assembly
File size: 0KB
Downloads: 0
Upload date: 2023-11-13 13:49:02
Uploader: sh-1993
Description: Programming in assembly language tutorial

File list:
hello-world/ (0, 2023-11-13)
hello-world/build-linux.sh (151, 2023-11-13)
hello-world/build-macos-sh (152, 2023-11-13)
hello-world/hello (8296, 2023-11-13)
hello-world/hello-linux.asm (409, 2023-11-13)
hello-world/hello-macos.asm (414, 2023-11-13)
images/ (0, 2023-11-13)
images/cardiac2-s.jpg (44440, 2023-11-13)
instruction-set/ (0, 2023-11-13)
instruction-set/addressing.asm (1562, 2023-11-13)
instruction-set/build.sh (74, 2023-11-13)

# Programming in assembly language tutorial

This tutorial covers AMD64/Intel 64 bit programming. Instruction sets for other processors, such as ARM or RISC-V, are radically different, though the concepts are the same. They all have instructions, registers, stacks, and so on. Once you know one processor's assembly language, adapting to a different processor is rather easy.

I found that I was writing code for a new processor within hours, and writing quality code within a week or two. This was going from Z80 to 6502 to 6809 to 8086 to 68000 and so on. It is interesting to be able to look at a processor's technical manuals and evaluate the power and flexibility of its instruction set.

This tutorial is aimed at novices and beginners who want to learn the first thing about assembly language programming. If you are an expert, you may or may not get a lot out of this.

- [Programming in assembly language tutorial](https://github.com/mschwartz/assembly-tutorial/blob/master/#programming-in-assembly-language-tutorial)
- [Introduction](https://github.com/mschwartz/assembly-tutorial/blob/master/#introduction)
- [Bits, Bytes, Words, and Number Bases](https://github.com/mschwartz/assembly-tutorial/blob/master/#bits-bytes-words-and-number-bases)
- [Math](https://github.com/mschwartz/assembly-tutorial/blob/master/#math)
- [Boolean Algebra](https://github.com/mschwartz/assembly-tutorial/blob/master/#boolean-algebra)
- [Bit Shifting](https://github.com/mschwartz/assembly-tutorial/blob/master/#bit-shifting)
- [Memory](https://github.com/mschwartz/assembly-tutorial/blob/master/#memory)
- [ELF Files and the Loader](https://github.com/mschwartz/assembly-tutorial/blob/master/#elf-files-and-the-loader)
- [Permissions](https://github.com/mschwartz/assembly-tutorial/blob/master/#permissions-sections-and-privileged-instructions)
- [MMU](https://github.com/mschwartz/assembly-tutorial/blob/master/#mmu)
- [Paging and Swapping](https://github.com/mschwartz/assembly-tutorial/blob/master/#paging-and-swapping)
- [Other exceptions](https://github.com/mschwartz/assembly-tutorial/blob/master/#other-exceptions)
- [Segfault](https://github.com/mschwartz/assembly-tutorial/blob/master/#segfault)
- [Divide By Zero](https://github.com/mschwartz/assembly-tutorial/blob/master/#divide-by-zero)
- [Invalid Opcode](https://github.com/mschwartz/assembly-tutorial/blob/master/#invalid-opcode)
- [General Protection](https://github.com/mschwartz/assembly-tutorial/blob/master/#general-protection)
- [ALU](https://github.com/mschwartz/assembly-tutorial/blob/master/#alu)
- [x64/AMD64 Registers](https://github.com/mschwartz/assembly-tutorial/blob/master/#x64amd64-registers)
- [General Purpose Registers](https://github.com/mschwartz/assembly-tutorial/blob/master/#general-purpose-registers)
- [Special Purpose Registers](https://github.com/mschwartz/assembly-tutorial/blob/master/#special-purpose-registers)
- [CPU Control Registers](https://github.com/mschwartz/assembly-tutorial/blob/master/#cpu-control-registers)
- [Stack](https://github.com/mschwartz/assembly-tutorial/blob/master/#stack)
- [Instruction Pointer](https://github.com/mschwartz/assembly-tutorial/blob/master/#instruction-pointer)
- [Flags](https://github.com/mschwartz/assembly-tutorial/blob/master/#flags)
- [AMD64 Instruction Set](https://github.com/mschwartz/assembly-tutorial/blob/master/#amd64-instruction-set)
- [Assembly source](https://github.com/mschwartz/assembly-tutorial/blob/master/#assembly-source)
- [Addressing Modes](https://github.com/mschwartz/assembly-tutorial/blob/master/#addressing-modes)
- [Register Operands](https://github.com/mschwartz/assembly-tutorial/blob/master/#register-operands)
- [Direct Memory Operands](https://github.com/mschwartz/assembly-tutorial/blob/master/#direct-memory-operands-better-known-as-immediate-operands)
- [Indirect Operands](https://github.com/mschwartz/assembly-tutorial/blob/master/#indirect-operands)
- [Indirect with Displacement](https://github.com/mschwartz/assembly-tutorial/blob/master/#indirect-with-displacement)
- [Indirect with displacement and scaled index](https://github.com/mschwartz/assembly-tutorial/blob/master/#indirect-with-displacement-and-scaled-index)
- [Commonly Used Instructions](https://github.com/mschwartz/assembly-tutorial/blob/master/#commonly-used-instructions)
- [Arithmetic](https://github.com/mschwartz/assembly-tutorial/blob/master/#aritmetic)
- [Boolean Algebra](https://github.com/mschwartz/assembly-tutorial/blob/master/#boolean-algebra-1)
- [Branching and Subroutines](https://github.com/mschwartz/assembly-tutorial/blob/master/#branching-and-subroutines)
- [Bit Manipulation](https://github.com/mschwartz/assembly-tutorial/blob/master/#bit-manipulation)
- [Register Manipulation, Casting/Conversions](https://github.com/mschwartz/assembly-tutorial/blob/master/#register-manipulation-castingconversions)
- [Flags Manipulation](https://github.com/mschwartz/assembly-tutorial/blob/master/#flags-manipulation)
- [Stack Manipulation](https://github.com/mschwartz/assembly-tutorial/blob/master/#stack-manipulation)
- [Assembler Source, Directives, and Macros](https://github.com/mschwartz/assembly-tutorial/blob/master/#assembler-source-directives--and-macros)
- [Assembler Directives](https://github.com/mschwartz/assembly-tutorial/blob/master/#assembler-directives)
- [section type](https://github.com/mschwartz/assembly-tutorial/blob/master/#section-type-options)
- [bits 16, bits 32, and bits 64, use16, use32, use64](https://github.com/mschwartz/assembly-tutorial/blob/master/#bits-16-bits-32-and-bits-64-use16-use32-use64)
- [Comments](https://github.com/mschwartz/assembly-tutorial/blob/master/#comments)
- [Constants](https://github.com/mschwartz/assembly-tutorial/blob/master/#constants)
- [Program Variables and Strings](https://github.com/mschwartz/assembly-tutorial/blob/master/#program-variables-and-strings)
- [Assembler Variables and Labels](https://github.com/mschwartz/assembly-tutorial/blob/master/#assembler-variables-and-labels)
- [Repetition](https://github.com/mschwartz/assembly-tutorial/blob/master/#repetion)
- [Macros](https://github.com/mschwartz/assembly-tutorial/blob/master/#macros)
- [Conditional Assembly](https://github.com/mschwartz/assembly-tutorial/blob/master/#conditional-assembly)
- [Alignment](https://github.com/mschwartz/assembly-tutorial/blob/master/#alignment)
- [Structures](https://github.com/mschwartz/assembly-tutorial/blob/master/#structures)
- [Includes](https://github.com/mschwartz/assembly-tutorial/blob/master/#includes)
- [Hello, World](https://github.com/mschwartz/assembly-tutorial/blob/master/#hello-world)
- [MacOS Version](https://github.com/mschwartz/assembly-tutorial/blob/master/#macos-version)
- [Linux version](https://github.com/mschwartz/assembly-tutorial/blob/master/#linux-version)
- [How it works](https://github.com/mschwartz/assembly-tutorial/blob/master/#how-it-works)
- [Linux Syscalls](https://github.com/mschwartz/assembly-tutorial/blob/master/#linux-syscalls)
- [MacOS Syscalls](https://github.com/mschwartz/assembly-tutorial/blob/master/#macos-syscalls)

## Introduction

How CPUs work has become something of a lost art. Only a small percentage of software engineers need to understand the inner workings of CPUs, typically those who work on embedded software, operating systems, compilers, or JIT compilers.

Assembly language was one of the first languages I ever learned. Back in the early/mid 1970s, my high school classes progressed from BASIC to FORTRAN IV, to BAL (Basic Assembly Language) for the IBM 360 to which we had access.

One of the earliest lessons we were taught used a cardboard teaching aid, CARDIAC. CARDIAC stands for "CARDboard Illustrative Aid to Computation"; it was developed at Bell Labs, which was a big deal back then (Unix was invented there, as well as the C programming language).
See https://www.cs.drexel.edu/~bls96/museum/cardiac.html.

With CARDIAC, you simulated the memory, operation, and CPU cycles of a mythical CPU. The numbers and instructions for this CPU were in base 10, so the student didn't have to understand how to convert to the base 2, base 8, or base 16 commonly used in computing.

CARDIAC provided a cardboard device that had representations for memory, program steps, and the ALU (math and logic operations). You wrote your program and variables on the cardboard and then, step by step, followed the program and performed the operations for each step.

The steps are identified by a single digit, 0-9:

- 0 INP read a card into memory
- 1 CLA clear accumulator and add from memory
- 2 ADD add from memory to accumulator
- 3 TAC test accumulator and jump if negative
- 4 SFT shift accumulator
- 5 OUT write memory location to output card
- 6 STO store accumulator to memory
- 7 SUB subtract memory from accumulator
- 8 JMP jump and save PC
- 9 HRS halt and reset

These values are "opcodes", and the encoded instructions/steps include the opcode plus an address, the number of bits to shift, etc.

The CPU features only two registers: accumulator and program counter. More complex and modern CPUs have many more registers than these two.

These instructions and registers are enough to learn from. You learn about memory layout, instruction opcodes, instruction encoding, memory access, and so on.

In this tutorial, I will cover the basics of programming the x64/AMD64 CPU in assembly language. As I progress, you will see how the CPU is really a glorified version of CARDIAC!

## Bits, Bytes, Words, and Number Bases

The smallest piece of information that a CPU processes is a "bit." A bit is a small integer or boolean type value, either 0 (off/false) or 1 (on/true).

Bits are then organized as "bytes", or 8 bits grouped together.
You can visualize a byte like this:

```
76543210
```

The digits represent what we call a bit number, and each digit (bits 0-7) may be a 0 or a 1. A byte can represent an unsigned value of 0 to 255, or a signed value of -128 to 127. Bit 7 of the byte is considered the "sign bit": if it is 1, then the byte as a signed value is negative; if it is 0, then the value is zero or positive.

Note that you decide whether the byte is processed as signed or unsigned; more on this later, but for now it is important to understand how the bits make up bytes and how signed/unsigned values are represented.

A "word" is two bytes grouped together, which means we have 16 bits together. You can visualize a word like this:

```
5432109876543210
111111
```

The second row holds the tens digit of bit numbers 10-15. The high order bit, the sign bit, is bit 15.

The x86 also has DWORD values, which are two words combined (32 bits). It also has QWORD values, which are two DWORDs combined (64 bits). The pattern is the same for any of these sizes: the high bit is the sign bit, etc. From this point forward, I'll use "word" to mean one of these sized values, unless otherwise stated.

When we talk about the value of the word, we typically use base 2, base 8, base 10, and base 16. Of these, base 8 isn't used much, but I'll explain a common use case for base 8.

In base 2 (also called "binary"), we simply talk about the value as the bits. That is, an unsigned byte might be 11111111, or 11101110, and so on. We might add a leading 0 and a terminating b for clarity (and this is the syntax used in assembly programming): 011111111b.

Base 10 is the number base we use every day. You count from 0 to 9 for each digit position in base 10. When you add 1 to the value 9, you clear it (set to 0), and bump the 10s digit. That is, 9+1 becomes 10. As you go right to left in base 10, the digits are n x 1 (10 to the power of 0), n x 10 (10 to the power of 1), n x 100 (10 to the power of 2), and so on.

In base 2, we count from 0 to 1 for each digit position.
When you add 1 to a 1 in a position in the byte, you clear it and increment the next higher bit (and continue until you find an existing 0 in position, which becomes 1). As you go right to left in base 2, the digits are n x 1 (2 to the power of 0), n x 2 (2 to the power of 1), n x 4 (2 to the power of 2), and so on.

In base 8 (also called "octal"), we count from 0-7 for each digit position. Going right to left, the digits are n x 8 to the power of 0, n x 8 to the power of 1, n x 8 to the power of 2, etc.

In base 16 (also called "hex"), we count from 0-15 for each digit position. We use a counting system that is 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F, then 10. So going from right to left in a hex number, the digits are n x 16 to the power of 0, n x 16 to the power of 1, n x 16 to the power of 2, and so on.

A "nybble" is useful for working with hex. A nybble is 4 bits. It turns out that the value you can store in 4 bits is 0-15, perfect for hex. Each hex digit position therefore corresponds to 4 bits, which is why the digit values go up by powers of 16 (2 to the power of 4).

Let's look at the unsigned value ranges for a few bit widths:

```
1 bit:  0-1
2 bits: 0-3
3 bits: 0-7
4 bits: 0-15
5 bits: 0-31
...
```

The pattern here is that the max value is 2 to the power of the number of bits, minus 1. That is, for 5 bits, the max value 31 is 2 to the 5th power (32) minus 1.

When we convert a binary byte to hex, we visualize it something like this:

```
76543210 is 7654 3210
```

We've grouped the bits as two nybbles. We can then convert the two nybbles (4 bits each) to two hex digits. This table makes the conversion simple, and if you practice using hex, you will soon know it by heart.

```
0000 | 0
0001 | 1
0010 | 2
0011 | 3
0100 | 4
0101 | 5
0110 | 6
0111 | 7
1000 | 8
1001 | 9
1010 | A
1011 | B
1100 | C
1101 | D
1110 | E
1111 | F
```

For example, we visualize the binary value 010100101b as 1010 0101. Using the table above, we see 1010 is A, and 0101 is 5. So the byte value is A5. We represent hex numbers in assembly as 0xa5, or 0a5h, or sometimes $a5.
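
The nybble-by-nybble conversion can be sketched in C. This is only an illustration, not code from the tutorial's repository: the function names `nybble_to_hex` and `byte_to_hex` are made up for this example, and division and remainder by 16 are used to split the byte so that nothing beyond ordinary arithmetic is needed.

```c
#include <stdint.h>

/* Look up the hex digit for one nybble (a value 0-15), as in the table above. */
char nybble_to_hex(uint8_t nybble)
{
    static const char digits[] = "0123456789ABCDEF";
    return digits[nybble % 16];
}

/* Split a byte into its two nybbles and write both hex digits into out.
   out must have room for 3 chars: two digits plus the terminating '\0'. */
void byte_to_hex(uint8_t value, char out[3])
{
    out[0] = nybble_to_hex(value / 16); /* high nybble, bits 7-4 */
    out[1] = nybble_to_hex(value % 16); /* low nybble, bits 3-0 */
    out[2] = '\0';
}
```

For instance, calling `byte_to_hex(0xA5, buf)` leaves the string `A5` in `buf`, matching the worked example.
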
We can use the same scheme to convert 16 bit or 32 bit or 64 bit values to hex!

I promised to discuss a use for octal, something we might use every day. In the Linux/Mac/*nix filesystem, permissions are actually octal values.

```
-rw-r--r-- 1 mschwartz staff 5.9K Feb 16 14:13 README.md
```

See the -rw-r--r-- ? After the leading - (the file type), what we have here is 9 bits, written as three octal digits. rw- is 110, r-- is 100, r-- is 100. So we can convert this to the internal filesystem representation of 644. If you want to make a file rw-r--r--, you use the chmod command:

```
chmod 644 README.md
```

The three bits, technically, are "able to read", "able to write", and "able to execute." The first octal digit is for the owner, the second is for anyone in the same user group as the owner, and the third is for everyone else. So to allow the owner and group to read and write, while nobody else can read or write the file, we want rw-rw---- or 660. To set a file to be executable, I typically use ```chmod 755```.

## Math

Adding two values of the same word size is simple. The byte 100 plus the byte 50 is 150. This works for signed and unsigned values: the bits are added the same way in both cases, and how you interpret the result is up to you. If the high order bit (bit 7 of a byte, bit 15 of a 16-bit word...) is 1, the signed value is negative. For example, the byte 150 read as a signed value is -106.

What happens when we add a byte value to a 16-bit word value? An unsigned byte value is really a 16-bit value whose upper 8 bits are zeros. That is, 0xaa can be visualized as 0x00aa. We just add the full 16-bit values together.

What happens when we add 1 to a byte size value of 255? We only have 8 bits for the result, but we have 9 bits of actual value. That is, 255 + 1 is 256. Represented in binary, 011111111b (255) + 1 = 0100000000b (256, which needs 9 bits!). The 9th bit is basically ignored as far as the result byte goes (more on this later). So if you look at the lower 8 bits of our 9 bit result, we get 0!

All this extends to 32 bit and 64 bit words.
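
The byte addition described above can be sketched in C. This is a minimal illustration under stated assumptions: the helper names `add_bytes`, `carry_out`, and `as_signed` are invented for this example, and the sum is computed in a wider type so the 9th bit can be inspected before it is dropped.

```c
#include <stdint.h>

/* Add two bytes and keep only the low 8 bits, as an 8-bit result would. */
uint8_t add_bytes(uint8_t a, uint8_t b)
{
    unsigned int wide = (unsigned int)a + (unsigned int)b; /* up to 9 bits of value */
    return (uint8_t)(wide % 256);                          /* drop the 9th bit */
}

/* Return the 9th bit of a + b: 1 if the sum did not fit in a byte, else 0. */
unsigned int carry_out(uint8_t a, uint8_t b)
{
    return ((unsigned int)a + (unsigned int)b) / 256;
}

/* Interpret the same 8 bits as a signed value: bit 7 set means negative. */
int as_signed(uint8_t bits)
{
    return bits >= 128 ? (int)bits - 256 : (int)bits;
}
```

Here `add_bytes(255, 1)` gives 0 with `carry_out(255, 1)` equal to 1, while `add_bytes(100, 50)` gives 150, which `as_signed` reads as -106.
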
Multiplication of two values requires a double-sized result, or you lose a lot more than just the 9th bit! Consider 255 x 255 = 65025 (0xfe01), which fits in 16 bits but not in 8. If we have a byte result, we get 0x01 due to the overflow, losing over 65000 in result value.

## Boolean Algebra

Boolean Algebra is a form of math that we use to deal with true/false values. We use Boolean Algebra all the time in various programming languages, with operators like & (AND), | (OR), ^ (exclusive OR, or XOR), ! (NOT), ~ (also NOT), and so on. These operators are equivalent to "math-like" operators.

The simplest way to visualize Boolean Algebra is using single bit values and truth tables. 0 = false, 1 = true. For single bit operands, there are always exactly 4 combinations possible.

```
AND (if both operands are true, the result is true)
0 & 0 = 0
0 & 1 = 0
1 & 0 = 0
1 & 1 = 1

OR (if either operand is true, the result is true)
0 | 0 = 0
0 | 1 = 1
1 | 0 = 1
1 | 1 = 1

XOR (if only one operand is true, the result is true)
0 ^ 0 = 0
0 ^ 1 = 1
1 ^ 0 = 1
1 ^ 1 = 0
```

The ! (NOT) operator only has one operand. If the operand is true, the result is false. If the operand is false, the result is true. Applied to every bit of a word, the result is also known as the 1's complement: we've just inverted the state of all the bits. The ~ (1's complement) operator inverts the bits in the word.

If we look at the operands as byte values, we have something like:

```
00000000 & 00000000 = 0
00000000 & 00000001 = 0
...
```

BUT, we have 8 bits, so the operation is performed on all 8 bits in the two operands.

```
    10000000
OR  00000001
    --------
    ^      ^
=   10000001
    ^      ^

NOT 10000001
=   01111110
```

This is a most important concept to grasp! We use the Boolean Algebra operators on words to achieve useful results.

A typical use of the AND operator is to clear bits in a value. If we AND with a value that is the inverse of a power of 2, we are simply clearing a bit. n AND ~4 clears bit 2 in n (4 is 00000100b, so ~4 is 11111011b).
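
The byte-wide operations can be tried out in C, which uses the same operator symbols. This is a sketch only; the wrapper names below are invented for illustration. One detail worth noting: in C the bitwise inverse is written `~`, while `!` is the logical NOT, so `!4` evaluates to 0 and only `~4` produces the clearing mask. With bits numbered from 0 as in the byte diagram earlier, the value 4 (00000100b) has only bit 2 set.

```c
#include <stdint.h>

/* Each operator works on all 8 bit positions of its operands at once. */
uint8_t and_bytes(uint8_t a, uint8_t b) { return (uint8_t)(a & b); }
uint8_t or_bytes(uint8_t a, uint8_t b)  { return (uint8_t)(a | b); }
uint8_t xor_bytes(uint8_t a, uint8_t b) { return (uint8_t)(a ^ b); }

/* 1's complement: invert every bit of the byte. */
uint8_t not_byte(uint8_t a) { return (uint8_t)~a; }

/* Clear every bit of n that is set in mask (AND with the inverted mask). */
uint8_t clear_bits(uint8_t n, uint8_t mask) { return (uint8_t)(n & ~mask); }
```

For example, `or_bytes(0x80, 0x01)` gives 0x81 (10000001b), `not_byte(0x81)` gives 0x7E (01111110b), and `clear_bits(0xFF, 4)` gives 0xFB (11111011b).
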
A typical use of the OR operator is to set bits in a value. If we OR with a value that is a power of 2, we are simply setting a bit. n OR 4 sets bit 2 in n.

A great use of the AND operator is to do a modulo of a number by a power of 2. For example, AND with 3 gets you a result between 0 and 3 (n modulo 4). AND with 7 gets you a result between 0 and 7 (n modulo 8).

## Bit Shifting

You can shift a byte to the left (<< operator in C) 1-7 bits. For example:

```
001111101b << 1 = 011111010b

   01111101   shifted left becomes
  ////////
 x11111010    (x is the old bit 7, which is lost; bit 0 becomes 0)
```

Note that we have the overflow problem here, as we did with addition. We have an upper bit that ends up in the "bit bucket" (thrown away).

A left shift of 1 bit is effectively a multiplication by 2. Consider 001b<<1 is 010b, or 2. A left shift of 2 bits is a multiply by 4, and so on.

Shifting to the right works similarly, but we now end up with the high bit being cleared and the low bit in the bit bucket. A right shift of 1 bit is effectively a divide by 2. But this right shift will take a negative number and make it positive because the sign bit is cleared. So we need a second kind of right shift (arithmetic shift right) for signed values that sets the high bit in the result to the high bit in the initial value.

A rotation left/right is the same as a shift, except instead of the lost bit ending up in the bit bucket, it becomes the new low bit (rotate left) or the new high bit (rotate right).

Other than for the multiply and divide effects, we use bit shifting frequently with Boolean Algebra. To set bit 3:

```
n | (1<<3)

To clear bit 3:
n & ~(1<<3)

Note that 1<<3 = 01000b, and ~(1<<3) is ~01000b or 00111b, looking only at the low 4 bits. (all the bits are inverted)
When you AND with 00111b, you are clearing bit 3.
```

## Memory

Memory (RAM) can be viewed as an array of bytes. If you have 1MB of RAM, your array is indexed from 0 to 1MB-1. The index is better known as an address.

Memory is used to store your program, for your program stack, for your program's heap (memory allocation) and to store your variables.
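
The "array of bytes" view can be modeled directly in C. This is a toy sketch under loud assumptions: `MEMORY_SIZE`, `store_byte`, and `load_byte` are hypothetical names, and the array is kept at 1024 bytes rather than 1MB to stay small. It models only the idea that an address is an index, not how real hardware or an operating system manages memory.

```c
#include <stddef.h>
#include <stdint.h>

#define MEMORY_SIZE 1024 /* hypothetical tiny RAM; valid addresses are 0 to MEMORY_SIZE-1 */

static uint8_t memory[MEMORY_SIZE];

/* Store one byte at the given address. Returns 0 on success, -1 if out of range. */
int store_byte(size_t address, uint8_t value)
{
    if (address >= MEMORY_SIZE)
        return -1;
    memory[address] = value;
    return 0;
}

/* Load the byte at the given address. Returns 0-255, or -1 if out of range. */
int load_byte(size_t address)
{
    if (address >= MEMORY_SIZE)
        return -1;
    return memory[address];
}
```

After `store_byte(0, 0xA5)`, a call to `load_byte(0)` returns 0xA5, while any address of `MEMORY_SIZE` or above is rejected with -1.
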
In a simple CPU and RAM setup, you might have your program start at index 0, your variables start at the end of the program, your heap start at the end of your variables, and your stack start at the top of memory and work its way downward as you push onto it.

```
HIGH memory address
+---------------+
|               |
| stack         |
| grows down    |
| address 1M    |
|               |
+---------------+
|               |
| heap          |
| grows up      |
|               |
+---------------+
|               |
| uninitialized |
| global        |
| variables     |
|               |
+---------------+
|               |
| initialized   |
| global        |
| variables     |
|               |
+---------------+
|               |
| code          |
| address 0     |
|               |
+---------------+
LOW memory address
```

## ELF Files and the Loader

On Linux, the compiler/assembler/linker generate ELF formatted files. An ELF file is divided into various sections. The more common sections are ```.text``` (code), ```.data``` (initialized data), ```.rodata``` (read-only data, i.e. constants), ```.bss``` (uninitialized data), and assorted debugging info sections.

The operating system program loader reads in the ELF file, allocates memory for the .text section, and loads that section from the file into that memory. It then does the same for the initialized data (.data) and for the constant data (.rodata). Finally, the loader allocates memory for the .bss section. Since the .bss section is uninitialized, it only needs to be allocated; nothing is read from the file for it.

The linker reads in intermediate object files (```.o```) and links them together to make the final executable. Each .o file may declare variables that can be accessed from other .o files, and may access variables that are defined in some other .o file. The linker fixes up the addresses in the final output so the code works as expected!

### Permissions (Sections and Privileged Instructions)

The compiler/assembler/linker generally makes the code execute only. If you try to store to those addresses, you will get a segfault.
The .data and .bss sections are marked as read/write and the .rodata section is marked as read-only.

The way words of the different sizes are stored in memory is determined by the "endianness" of the CPU. A CPU that is big endian stores the high byte first in memory, the next highest byte next, ... and finally the lowest byte last. A CPU that is little endian stores the low byte first, ... the high byte last.

The CPU has special features that enforce these permissions. If you try to defeat the permissions, a segfault exception is thrown. The operating system sets up these features when the program is started. When the exception occurs, it kills the program and potentially generates a core dump file of the program. The core dump file can be used later to do forensic debugging/analysis of the failure.

### MMU

In modern operating systems, the CPU uses an MMU (Memory Mana ... ...
