august

Category: Assembly language
Development tool: Assembly
File size: 0KB
Downloads: 0
Upload date: 2021-04-25 16:02:03
Uploader: sh-1993
Description: Assembler from scratch written in Ink, supporting ELF on x86_64 _and_ more.

File list:
LICENSE (1066, 2021-04-25)
Makefile (259, 2021-04-25)
lib/ (0, 2021-04-25)
lib/bytes.ink (1181, 2021-04-25)
src/ (0, 2021-04-25)
src/asm.ink (17271, 2021-04-25)
src/cli.ink (969, 2021-04-25)
src/elf.ink (7510, 2021-04-25)
src/test.ink (1613, 2021-04-25)
test/ (0, 2021-04-25)
test/asm/ (0, 2021-04-25)
test/asm/000.asm (760, 2021-04-25)
test/asm/001.asm (47, 2021-04-25)
test/asm/002.asm (65, 2021-04-25)
test/asm/003.asm (191, 2021-04-25)
test/asm/004-sym.asm (312, 2021-04-25)
test/asm/004.asm (224, 2021-04-25)
test/asm/005.asm (446, 2021-04-25)
test/asm/006.asm (443, 2021-04-25)
test/asm/007.asm (760, 2021-04-25)
vendor/ (0, 2021-04-25)
vendor/cli.ink (1105, 2021-04-25)
vendor/quicksort.ink (966, 2021-04-25)
vendor/std.ink (6811, 2021-04-25)
vendor/str.ink (4191, 2021-04-25)
vendor/suite.ink (1271, 2021-04-25)

# August

**August** is an assembler written from scratch in [Ink](https://dotink.co/) for me to learn about assemblers, linkers, executable file formats, and compiler backends. It currently supports assembling and linking (in a single step) x86_64 [ELF](https://en.wikipedia.org/wiki/Executable_and_Linkable_Format) binaries for Linux, and might in the future support ELF executables for ARM, RISC-V, and x86 architectures.

In the far long term, August might also become a code generation backend for a compiler written in Ink for some small subset of C if I feel adventurous. But for now, August is an educational project that assembles a subset of x86_64 to a Linux ELF binary.

August currently supports the following features:

- A good portable subset of the integer x86_64 instruction set
- Support for arguments as immediates, registers, and labels
- Embedded read-only data segments
- Symbol tables for debugging and disassembly

You can see some example assembly code that August can assemble and link under [`test/`](test/).

## Design

August provides a CLI, `./src/cli.ink`, that currently takes a single assembly program and emits a single statically-linked x86_64 ELF executable. Under the hood, August reads the assembly program, parses it into a simple representation of symbols and sections in the source, assembles it into machine code, and links it all together with a minimal ELF linker.

At the moment, the assembler and linker are pretty tightly integrated. The ELF linker assumes that only two sections are used, `.text` and `.rodata`, and the assembler generates code with that assumption. The virtual address table for the generated executable is also currently hard-coded into the linker and relied on by the assembler when resolving symbols.

Here's a transcript of a shell session that demonstrates what August can do today. We take a bare-bones Hello World program for Linux on x86_64, assemble it with August, run it, and dump the generated assembly with `objdump`.

```asm
$ cat test/asm/004-sym.asm
; Hello World

section .text

; implicit _start:
    mov eax 0x1     ; write syscall
    mov edi 0x1     ; stdout
    mov esi msg     ; string to print
    mov edx len     ; length
    syscall

exit:
    mov eax 60      ; exit syscall
    mov edi 0       ; exit code
    syscall

section .rodata

msg:
    db "Hello, World!" 0xa
len:
    eq 14
```

Assemble the program and run the emitted executable, which prints "Hello, World!" and exits cleanly.

```bash
$ august test/asm/004-sym.asm ./hello-world
executable written.
$ ./hello-world
Hello, World!
$ echo $?
0
```

If we disassemble the generated executable, we find the assembly we began with.

```asm
$ objdump -d ./hello-world

./hello-world:     file format elf64-x86-64

Disassembly of section .text:

0000000000401000 <_start>:
  401000:   b8 01 00 00 00      mov    eax,0x1
  401005:   bf 01 00 00 00      mov    edi,0x1
  40100a:   be 00 50 6b 00      mov    esi,0x6b5000
  40100f:   ba 0e 00 00 00      mov    edx,0xe
  401014:   0f 05               syscall

0000000000401016 <exit>:
  401016:   b8 3c 00 00 00      mov    eax,0x3c
  40101b:   bf 00 00 00 00      mov    edi,0x0
  401020:   0f 05               syscall
    ...
```

### Assembler

The instruction encoding is handled by the [`./src/asm.ink`](src/asm.ink) library within the project. Currently, August can assemble simple programs that work with 32-bit registers and the ALU, handle branches and jumps, make system calls and function calls per the x86 calling convention, and read or write to memory.

Even with these basic building blocks, we can write programs that do interesting things like loop, manipulate memory, and make recursive calls. You can check out some examples in [`test/asm/`](test/asm/).

### ELF Linker

August uses a library for constructing ELF executable files located at [`./src/elf.ink`](src/elf.ink). The ELF generated by the ELF library in August currently makes use of three sections:

- `.text` containing the program text, i.e. translated x64 assembly
- `.rodata` containing read-only data loaded into process memory as read-only
- `.shstrtab` containing the section header string table, i.e. the names of the sections

The content of `.text` and `.rodata` sections can be provided to the ELF library, which will return a fully linked ELF binary as the result. All labels found in the assembly code are treated as local function symbols and placed into the generated symbol table.

## References and further reading

The ELF file format is quite well documented, especially in source bases of various linkers, assemblers, and kernels, but the available reference material for _implementing_ an ELF linker is not... what you would call super accessible. In the process of building August, I've found the following references particularly helpful.

- [`man elf` on Linux](https://man7.org/linux/man-pages/man5/elf.5.html) and the `elf` header file in the kernel sources, which provide the canonical reference for implementations of ELF files
- [A Whirlwind Tutorial on Creating Really Teensy ELF Executables for Linux](http://www.muppetlabs.com/~breadbox/software/tiny/teensy.html), which breaks down the ELF format for executable files at a high level
- [LWN's write-up of the Linux kernel's view of ELF executables](https://lwn.net/Articles/631631/), with another breakdown of ELF executables
- [Solaris's documentation on ELF object files](https://docs.oracle.com/cd/E53394_01/html/E54813/chapter6-93046.html#scrolltoc), a good in-depth reference
- [Notes on the ELF specification](http://www.muppetlabs.com/~breadbox/software/ELF.txt), which is long but very, very comprehensive, occasionally useful for studying edge cases

In writing an x86/x64 assembler, the following were especially helpful to get me up to speed.

- [The x86asm.net ISA reference](http://ref.x86asm.net/coder64.html), which is comprehensive enough for a toy assembler and easy to navigate once you get used to the compact notation
- [Encoding x86 Instructions](http://www.cs.loyola.edu/~binkley/371/Encoding_Real_x86_Instructions.html), which was a helpful guide to understanding how x86 and x64 instructions are encoded
- The [x64 cheat sheet](http://cs.brown.edu/courses/cs033/docs/guides/x64_cheatsheet.pdf) for a handy list of the core x86/x64 instruction set
- The [Calling Conventions](https://wiki.osdev.org/Calling_Conventions) article on OSDev Wiki

## Development

To work on August, you obviously need [Ink](https://dotink.co/) installed. [Inkfmt](https://github.com/thesephist/inkfmt) is also useful for auto-formatting code, which you can run with `make format` or `make f`.

When I work on August (especially the instruction encoder), I usually have two other panes open, running:

- `ls test/asm/*.asm lib/*.ink src/*.ink | entr -cr make` so every file change assembles and runs a program to test
- `ls ./b.out | entr -cr objdump -d -Mintel ./b.out` so that every time the executable is re-compiled, I can see the disassembly of the executable and check it against the intended assembly code.

There is a growing test suite for the assembler / x86 instruction encoder, which you can run with `make check` or `make t`.
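
When checking disassembly by hand like this, it helps to know what bytes to expect. The following is a minimal sketch, in Python rather than Ink and not taken from `src/asm.ink`, of how the `mov r32, imm32` and `syscall` instructions from the Hello World transcript above are encoded. The names `REG32`, `SYSCALL`, and `encode_mov_r32_imm32` are hypothetical, and the symbol values for `msg` and `len` are simply copied from the `objdump` output rather than resolved by a linker.

```python
import struct

# Hypothetical helper names; this is not how src/asm.ink is organized.
# Register numbers of the eight legacy 32-bit registers (no REX prefix needed).
REG32 = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
         "esp": 4, "ebp": 5, "esi": 6, "edi": 7}

SYSCALL = bytes([0x0F, 0x05])  # two-byte `syscall` opcode


def encode_mov_r32_imm32(reg: str, imm: int) -> bytes:
    """Encode `mov r32, imm32`: opcode 0xB8 + register number, then imm32 little-endian."""
    return bytes([0xB8 + REG32[reg]]) + struct.pack("<I", imm & 0xFFFFFFFF)


# The first five instructions of the Hello World transcript, with the
# symbol values (msg = 0x6b5000, len = 14) copied from the disassembly.
code = b"".join([
    encode_mov_r32_imm32("eax", 0x1),       # write syscall
    encode_mov_r32_imm32("edi", 0x1),       # stdout
    encode_mov_r32_imm32("esi", 0x6B5000),  # address of msg
    encode_mov_r32_imm32("edx", 14),        # len
    SYSCALL,
])
print(code.hex(" "))
# prints: b8 01 00 00 00 bf 01 00 00 00 be 00 50 6b 00 ba 0e 00 00 00 0f 05
```

These 22 (0x16) bytes match the first block of the disassembly, which is why the second block in the transcript starts at address `401016`.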
