assembler

所属分类:嵌入式/单片机/硬件编程
开发工具:GO
文件大小:0KB
下载次数:0
上传日期:2020-12-02 07:14:55
上 传 者sh-1993
说明:  基本X86-64汇编程序,用golang编写,
(Basic X86 -64 assembler, written in golang,)

文件列表:
LICENSE (18092, 2020-12-01)
call.asm (2840, 2020-12-01)
cmd/ (0, 2020-12-01)
cmd/assembler/ (0, 2020-12-01)
cmd/assembler/main.go (510, 2020-12-01)
cmd/lexer/ (0, 2020-12-01)
cmd/lexer/main.go (499, 2020-12-01)
cmd/parser/ (0, 2020-12-01)
cmd/parser/main.go (452, 2020-12-01)
compiler/ (0, 2020-12-01)
compiler/compiler.go (20807, 2020-12-01)
elf/ (0, 2020-12-01)
elf/elf.go (4418, 2020-12-01)
exit.asm (256, 2020-12-01)
go.mod (41, 2020-12-01)
hello.asm (858, 2020-12-01)
instructions/ (0, 2020-12-01)
instructions/instructions.go (1840, 2020-12-01)
jmp.asm (279, 2020-12-01)
lexer/ (0, 2020-12-01)
lexer/lexer.go (5252, 2020-12-01)
lexer/lexer_test.go (3663, 2020-12-01)
parser/ (0, 2020-12-01)
parser/ast.go (2470, 2020-12-01)
parser/parser.go (7255, 2020-12-01)
parser/parser_test.go (1577, 2020-12-01)
test.asm (871, 2020-12-01)
token/ (0, 2020-12-01)
token/token.go (1659, 2020-12-01)
token/token_test.go (458, 2020-12-01)

[![GoDoc](https://img.shields.io/static/v1?label=godoc&message=reference&color=blue)](https://pkg.go.dev/github.com/skx/assembler) [![Go Report Card](https://goreportcard.com/badge/github.com/skx/assembler)](https://goreportcard.com/report/github.com/skx/assembler) [![license](https://img.shields.io/github/license/skx/assembler.svg)](https://github.com/skx/assembler/blob/master/LICENSE) * [Assembler](#assembler) * [Limitations](#limitations) * [Installation](#installation) * [Example Usage](#example-usage) * [Internals](#internals) * [Adding New Instructions](#adding-new-instructions) * [Debugging Generated Binaries](#debugging-generated-binaries) * [Bugs?](#bugs) # Assembler This repository contains a VERY BASIC x86-64 assembler, which is capable of reading assembly-language input, and generating a staticly linked ELF binary output. It is more a proof-of-concept than a useful assembler, but I hope to take it to the state where it can compile the kind of x86-64 assembly I produce in some of my other projects. Currently the assembler will generate a binary which looks like this: ``` $ file a.out a.out: ELF 64-bit LSB executable, x86-64, version 1 (SYSV) statically linked, no section header ``` Why? I've written a couple of toy projects that generate assembly language programs, then pass them through an assembler: * [brainfuck compiler](https://github.com/skx/bfcc/) * [math compiler](https://github.com/skx/math-compiler/) The code in this repository was born out of the process of experimenting with generating an ELF binary directly. A necessary learning-process. ## Limitations We don't support anywhere near the complete instruction-set which an assembly language programmer would expect. Currently we support only things like this: * `add $REG, $REG` + `add $REG, $NUMBER` * Add a number, or the contents of another register, to a register. * `call $LABEL` * See [call.asm](call.asm) for an example. * `dec $REG` * Decrement the contents of the specified register. * We also support indirection, so the following work: * `inc byte ptr [$REG]` * `inc word ptr [$REG]` * `inc dword ptr [$REG]` * `inc qword ptr [$REG]` * `inc $REG` * Increment the contents of the specified register. * We also support indirection, so the following work: * `inc byte ptr [$REG]` * `inc word ptr [$REG]` * `inc dword ptr [$REG]` * `inc qword ptr [$REG]` * `jmp $LABEL`, `je $LABEL`, `jne $LABEL` * We support jumping instructions, but only with -127/+128 byte displacements * See [jmp.asm](jmp.asm) for a simple example. * `mov $REG, $NUMBER` * `mov $REG, $REG` * Move a number into the specified register. * `nop` * Do nothing. * `push $NUMBER`, or `push $IDENTIFIER` * `ret` * Return from call. * **NOTE**: We don't actually support making calls, though that can be emulated via `push` - see [jmp.asm](jmp.asm) for an example. * `sub $REG, $REG` + `sub $REG, $NUMBER` * Subtract a number, or the contents of another register, from a register. * `xor $REG, $REG` * Set the given register to be zero. * `int $NUM` * Call the kernel. * Processor (flag) control instructions: * `clc`, `cld`, `cli`, `cmc`, `stc`, `std`, and `sti`. Note that we really only support the following registers, you'll see that we only support the 64-bit registers (which means `rax` is supported but `eax`, `ax`, `ah`, and `al` are specifically __not__ supported): * `rax` * `rcx` * `rdx` * `rbx` * `rsp` * `rbp` * `rsi` * `rdi` There is _some_ support for the extended registers `r8`-`r15`, but this varies on a per-instruction basis and should not be relied upon. There is support for storing fixed-data within our program, and locating that. See [hello.asm](hello.asm) for an example of that. We also have some other (obvious) limitations: * There is notably no support for comparison instructions, and jumping instructions. * We _emulate_ (unconditional) jump instructions via "`push`" and "`ret`", see [jmp.asm](jmp.asm) for an example of that. * The entry-point is __always__ at the beginning of the source. * You can only reference data AFTER it has been declared. * These are added to the `data` section of the generated binary, but must be defined first. * See [hello.asm](hello.asm) for an example of that. ## Installation If you have this repository cloned locally you can build the assembler like so: cd cmd/assembler go build . go install . If you wish to fetch and install via your existing toolchain: go get -u github.com/skx/assembler/cmd/assembler You can repeat for the other commands if you wish: go get -u github.com/skx/assembler/cmd/lexer go get -u github.com/skx/assembler/cmd/parser Of course these binary-names are very generic, so perhaps better to work locally! ## Example Usage Build the assembler: $ cd cmd/assembler $ go build . Compile the [sample program](test.asm), and execute it showing the return-code: $ cmd/assembler/assembler test.asm && ./a.out ; echo $? 9 Or run the [hello.asm](hello.asm) example: $ cmd/assembler/assembler hello.in && ./a.out Hello, world Goodbye, world You'll note that the `\n` character was correctly expanded into a newline. # Internals The core of our code consists of a small number of simple packages: * A simple tokenizer [lexer/lexer.go](lexer/lexer.go) * A simple parser [parser/parser.go](parser/parser.go) * This populates a simple internal-form/AST [parser/ast.go](parser/ast.go). * A simple compiler [compiler/compiler.go](compiler/compiler.go) * A simple elf-generator [elf/elf.go](elf/elf.go) * Taken from [vishen/go-x64-executable](https://github.com/vishen/go-x64-executable/). In addition to the package modules we also have a couple of binaries: * `cmd/lexer` * Show the output of lexing a program. * This is useful for debugging and development-purposes, it isn't expected to be useful to end-users. * `cmd/parser` * Show the output of parsing a program. * This is useful for debugging and development-purposes, it isn't expected to be useful to end-users. * `cmd/assembler` * Assemble a program, producing an executable binary. These commands located beneath `cmd` each operate the same way. They each take a single argument which is a file containing assembly-language instructions. For example here is how you'd build and test the parser: cd cmd/parser go build . $ ./parser ../../test.asm &{{INSTRUCTION xor} [{REGISTER rax} {REGISTER rax}]} &{{INSTRUCTION inc} [{REGISTER rax}]} &{{INSTRUCTION mov} [{REGISTER rbx} {NUMBER 0x0000}]} &{{INSTRUCTION mov} [{REGISTER rcx} {NUMBER 0x0007}]} &{{INSTRUCTION add} [{REGISTER rbx} {REGISTER rcx}]} &{{INSTRUCTION mov} [{REGISTER rcx} {NUMBER 0x0002}]} &{{INSTRUCTION add} [{REGISTER rbx} {REGISTER rcx}]} &{{INSTRUCTION int} [{NUMBER 0x80}]} ## Adding New Instructions This is how you might add a new instruction to the assembler, for example you might add `jmp 0x00000` or some similar instruction: * Add a new entry for the instruction in [instructions/instructions.go](instructions/instructions.go) * i.e. Update `InstructionLengths` map to add the instruction. * This will be used by both the tokenization process, and the parser. * Generate the appropriate output in `compiler/compiler.go`, inside the function `compileInstruction`. * i.e. Emit the binary-code for the instruction. ## Debugging Generated Binaries Launch the binary under gdb: $ gdb ./a.out Start it: (gdb) starti Starting program: /home/skx/Repos/github.com/skx/assembler/a.out Program stopped. 0x00000000004000b0 in ?? () Dissassemble: (gdb) x/5i $pc Or show string-contents at an address: (gdb) x/s 0x400000 # Bugs? Feel free to report, as this is more a proof of concept rather than a robust tool they are to be expected. Specifically we're missing support for many instructions, but I hope the code generated for those that is present is correct. Steve

近期下载者

相关文件


收藏者