ari

所属分类:编程语言基础
开发工具:Rust
文件大小:0KB
下载次数:0
上传日期:2023-06-23 16:46:23
上 传 者sh-1993
说明:  一种以类型为中心的纯函数式编程语言,设计用于输入二进制文件。在git上维护的代码镜像...
(A type-centred purely functional programming language designed to type binary files. Mirror of code maintained at gitlab.com.)

文件列表:
.envrc (45, 2023-12-28)
COPYING (35149, 2023-12-28)
Cargo.lock (6218, 2023-12-28)
Cargo.toml (312, 2023-12-28)
ari.svg (25873, 2023-12-28)
flake.lock (3635, 2023-12-28)
flake.nix (2888, 2023-12-28)
nix/ (0, 2023-12-28)
nix/packages/ (0, 2023-12-28)
nix/packages/ari.nix (1020, 2023-12-28)
src/ (0, 2023-12-28)
src/ast.rs (15331, 2023-12-28)
src/lib.rs (157, 2023-12-28)
src/natural.rs (3256, 2023-12-28)
src/parser.rs (7737, 2023-12-28)
tests/ (0, 2023-12-28)
tests/parser.rs (244, 2023-12-28)
tests/parser/ (0, 2023-12-28)
tests/parser/labels.rs (6819, 2023-12-28)
tests/parser/naturals.rs (2341, 2023-12-28)
tests/parser/resolved_references.rs (8959, 2023-12-28)
tests/parser/sexprs.rs (2113, 2023-12-28)
tests/parser/symbols.rs (2671, 2023-12-28)
tests/parser/unresolved_references.rs (4297, 2023-12-28)

# ari A type-centred [purely functional programming](https://en.wikipedia.org/wiki/Purely_functional_programming) language designed to type binary files. Funding for Ari is provided by [NLnet](https://nlnet.nl) and the European Commission through the [NGI Assure](https://www.assure.ngi.eu) initiative. ## Why does this exist? ### To make binary files more accessible Most binary files require specially designed tools to read & write, but ari types are designed so that you can decompose any binary file down into its component parts. The language is intended to be a foundation for other tools to build on top of. Some of the tools we plan on building with ari include: #### arie An editor specially designed to edit files using ari types. We'll also build plugins for existing text editors. #### ariq A command line tool to query components of a file using ari types. #### aric A command line tool to compile ari types into other languages using [ari map types](#exponentiate-and-map-expressions). #### arid A command line tool to structurally diff files of the same ari type. #### ariz A command line tool to compress files & directories using ari types. Ari's type system is designed to act as a guide for [arithmetic coding](https://en.wikipedia.org/wiki/Arithmetic_coding). ### ... but also, not just binary formats While the primary focus is to help interpret binary data, ari is also designed to model grammars for text-based languages. Here is a quick comparison of how some formal language concepts map into ari: > **NOTE:** This comparison is meant for people who are already > familiar with these concepts. > > If you want to get a better understanding of what they mean, or what > the ari equivalents mean, see the [language > reference](#language-reference).
RegexAri
```regex a|b|c ``` ```lisp (| "a" "b" "c") ```
```regex abc ``` ```lisp "abc" ``` or ```lisp (* "a" "b" "c") ```
```regex a? ``` ```lisp (^ "a" ?) ```
```regex a* ``` ```lisp (^ "a" _) ```
```regex a+ ``` ```lisp (^ "a" !) ```
```regex a{3} ``` ```lisp (^ "a" 3) ```
```regex a{2, 5} ``` ```lisp (^ "a" (.. 2 5)) ``` or ```lisp (^ "a" (| 2 3 4)) ```
```regex . ``` ```lisp codepoint ```
Backus–Naur formAri
```bnf "terminal" ``` ```lisp "terminal" ```
```bnf ``` ```lisp non-terminal ```
```bnf ::= "ab" | "cd" ``` ```lisp :symbol (| "ab" (* "cd" x)) ```
## What makes ari unique? ### Types are the main focus of the language In ari, everything is a type. Types and type expressions are the primary focus of the language. You can add, multiply, and even exponentiate types, much like you can a number in any other programming language.
Ari type expressionsEquivalent types
```lisp :sum (+ a b c) ``` - [enum](https://en.wikipedia.org/wiki/Enumerated_type) - [tagged union](https://en.wikipedia.org/wiki/Tagged_union)
```lisp :product (* a b c) ``` - [record, struct, tuple]() - [class]() - [trait]() - [concept, interface]() - [protocol]()
```lisp :map (^ a b c) ``` - [function type signature](https://en.wikipedia.org/wiki/Type_signature) - [associative array, map, dictionary](https://en.wikipedia.org/wiki/Associative_array) - [array type](https://en.wikipedia.org/wiki/Array_data_type) - [stack](https://en.wikipedia.org/wiki/Stack_%28abstract_data_type%29) - [matrix]() - [vector space](https://en.wikipedia.org/wiki/Vector_space) - [tensor](https://en.wikipedia.org/wiki/Tensor)
See the [language reference](#language-reference) ### Generalizes the idea of primitive types In ari, there are infinitely many primitives types generalized by a single concept, [natural numbers](https://en.wikipedia.org/wiki/Natural_number). We call these primitive types [natural types](#natural-expressions). ...well what exactly does this mean? Say you want to create a `size` type that has 3 possible states, `small`, `medium`, `large`. In ari, you can precisely type the number of states using the `3` type: ```lisp :size 3 ``` The possible values of this type are `0`, `1`, `2`, which you can use to represent `small`, `medium`, `large`. This is a powerful concept, but this small example doesn't fit with the main goal of ari, to make things more accessible. ...this is exactly where [algebraic expressions](#algebraic-expressions) and [labels](#label-expressions) come in! You can break down and label the individual states of a `3` type with a [sum expression](#additive-expressions). For example, this sum expression produces a sum type that's equivalent to the `3` type: ```lisp :size (+ :small 1 :medium 1 :large 1) ``` and the labelled states in this example have a 1:1 correspondence with the `3` type: - `small` = `0` - `medium` = `1` - `large` = `2` The whole idea of ari is that binary data can be interpreted as one giant number, and we provide ways to break big numbers down into smaller numbers... and label them . > **NOTE:** Many other languages can partially model the idea of sum > types with "enums". The main issue though is enums are often > "rounded" to the best matching primitive type(s) (usually some kind > of int type). In ari every possible sum type has an equivalent > natural type, which allows for a level of precision not possible in > other type systems. ### Value bindings and dependent types In ari all types have an implicit binding with a corresponding runtime value. [Dereference expressions `@`](#dereference-expressions) can be used on a type to bring this value from "value-space" into "type-space". Here's an example where they're used to describe a 24-bit RGB colour image with a dynamically sized `width` & `height`: ```lisp :byte (^ :bit 2 :bit-length 8) :my-image-format (* :width byte :height byte :image (^ :pixel (* :r byte :g byte :b byte) @width @height ) ) ``` In this example the `@width` and `@height` types are evaluated from the runtime value of `width` & `height`. Expressions are normally evaluated at compile time, but this pushes the evaluation to runtime. Ari can use the same expressions between compile time & runtime because it's [purely functional](https://en.wikipedia.org/wiki/Functional_programming), meaning that expressions don't depend on any hidden state. ### Labels are optional, but you can also label anything Most languages force you to name things, even when the name is redundant or unnecessary. A lot of these languages have to come up with parallel constructs to match their named counterparts. eg. `functions lambdas`, `structs tuples`... Often there isn't an unnamed alternative. You'll be forced to decouple & name things even when an inlined alternative would be easier to understand. Ari takes a fundamentally different approach. Any expression can be inlined without labelling, but also, all expressions can be labelled. This gives developers flexibility to do whatever makes the most sense. Maybe something is only used once and it would be simpler to just inline it, maybe you don't even know how to name something yet, but you want to define its structure. Maybe you just want to refer to a type nested within another type without decoupling it from its parent. When you can label everything, anything can be a module. ## What ideas does ari steal? Ari steals a bunch of good ideas from various places: Academic Inspirations: - [Type theory](https://en.wikipedia.org/wiki/Type_theory) & [algebraic types](https://en.wikipedia.org/wiki/Algebraic_data_type) - [Group theory](https://en.wikipedia.org/wiki/Group_theory) - [Set theory](https://en.wikipedia.org/wiki/Set_theory) - [Curry-Howard correspondence](https://en.wikipedia.org/wiki/Curry%E2%80%93Howard_correspondence) - [Arithmetic coding](https://en.wikipedia.org/wiki/Arithmetic_coding) - [Purely functional programming](https://en.wikipedia.org/wiki/Purely_functional_programming) - [Lazy evaluation](https://en.wikipedia.org/wiki/Lazy_evaluation) - [Currying](https://en.wikipedia.org/wiki/Currying) - [Homoiconicity](https://en.wikipedia.org/wiki/Homoiconicity) Language Inspirations: - [Lisp]() ([s-expressions](https://en.wikipedia.org/wiki/S-expression) - ari is not a lisp, homoiconicity, functional-focused) - [Rust](https://www.rust-lang.org) (sum types & other algebraic types) - [Nix](https://nixos.org/manual/nix/stable/expressions/expression-language.html) (purely functional, lazily evaluated, currying) - [Haskell](https://www.haskell.org) (indirectly through Rust & Nix + monads) ## Language reference ### Atomic expressions These are the predefined base expressions of ari, they produce the "atoms" for [symbolic expressions](#symbolic-expressions). #### Natural expressions ```lisp 0 1 2 3 ... ``` Produce the [primitive types](https://en.wikipedia.org/wiki/Primitive_data_type) of the language, modelled after [natural numbers](https://en.wikipedia.org/wiki/Natural_number). Its "value" encodes the number of possible states that can exist in the type. ##### Bottom expression ```lisp 0 ``` Produces the [bottom type `⊥`](https://en.wikipedia.org/wiki/Bottom_type), which has no possible states. You can think of this as the "error type". This is the [identity type](https://en.wikipedia.org/wiki/Identity_element) for [sum expressions](#add-and-sum-expressions): ```lisp (= (+ x 0) x ) ``` There are certain contexts that propagate the bottom type: - [Product expressions](#multiply-and-product-expressions) - The base of [map expressions](#exponentiate-and-map-expressions) This "error propagation" can be "caught" by other expressions. For example: ```lisp (+ 256 (* 256 0)) ``` `(* 256 0)` evaluates to `0`, the bottom type, but `(+ 256 0)` evaluates to `256`... which is not the bottom type. This [sum expression](#add-and-sum-expressions) catches the error! This doesn't _seem_ useful when only working with natural expressions, why not just remove expressions that propagate the bottom type? Well.... types can also be evaluated at runtime with [dereference expressions `@`](#dereference-expressions), and we use this same idea to catch runtime errors. ##### Unit expression ```lisp 1 ``` Produces the [unit type](https://en.wikipedia.org/wiki/Unit_type), which has only one possible state. You can think of this as a type that doesn't actually store any information. This is the [identity type](https://en.wikipedia.org/wiki/Identity_element) for [product expressions](#add-and-sum-expressions): ```lisp (= (* x 1) x ) ``` #### Label expressions ```lisp :label 123 ``` Define a name for the result of an expression in the current scope. #### Symbol expressions ```lisp symbol ``` Produce "symbol types", which are equivalent to whatever type is associated with `symbol` in the current scope. #### Value expressions In ari, all types are implicitly bound to a corresponding runtime value. Value expressions are used to refer to these runtime values. They can only be used in: - [Product expressions](#multiply-and-product-expressions) - The base of [map expressions](#exponentiate-and-map-expressions) > **NOTE:** These are the same contexts that propagate the [bottom > type `0`](#bottom-expression). ##### Reference expressions ```lisp $symbol ``` Produce "reference types", which are bound to the same runtime value of `symbol`, but are equivalent to [`1`](#unit-expression). ##### Dereference expressions ```lisp @symbol ``` When applied to symbol types, produce a type from the runtime value of the symbol type. ```lisp @123 @(+ :small 1 :medium 1 :large) ``` When applied to non-symbol types, produce an "effective type" from the runtime value of the non-symbol type. "Effective types" don't have an associated runtime value. #### Path expressions ```lisp symbol:x:y:z (* :x (* :y (* :z 256))):x:y:z ``` Produce a nested type contained within another type. #### Extended symbol expressions ```ari '(symbol)' ``` Produce symbols that can include non-whitespace special characters. To encode quotes in this expression, double them up: ```ari '''quoted''' ``` > **NOTE:** To get a better understating of how this works, these > expressions only terminate after an odd sequence of quotes. Then all > terminating quotes (except for the last) are interpreted as part of > the symbol. Any characters following whitespace will be treated as documentation: ```lisp :'pixel A 24bit RGB pixel' (* :r 256 :g 256 :b 256) 'pixel Here we're referencing pixel' ``` #### Unicode text expressions ```ari "()" ``` A convenient notation for defining text-based grammars, which is actually just syntax sugar for runtime [assertion expressions](#assertion-expressions) that assert for [Unicode](https://en.wikipedia.org/wiki/Unicode) text. To encode quotes in this expression, double them up: ```ari "{ ""abc"": 123 }" ``` > **NOTE** This follows the same behaviour as [extended symbol > expressions](#extended-symbol-expressions) Text not bound to any particular encoding context will be interpreted as utf-8, but this can be changed with text encoding macros: ```lisp (= (ascii-8 "text") (ascii "text") ) (ascii-7 "text") (utf-16 "text") (utf-32 "text") ``` ##### Codepoint symbol ```lisp codepoint ``` Represents a single Unicode [code point](https://en.wikipedia.org/wiki/Code_point) for text in the current text encoding context. This can be dynamically sized. ##### Grapheme symbol ```lisp grapheme ``` Represents a single Unicode [grapheme](https://en.wikipedia.org/wiki/Grapheme) for text in the current text encoding context. This can be dynamically sized. ### Symbolic expressions ```lisp (sexpr arg1 arg2 arg3) ``` Symbolic expressions are a list of [atomic expressions](#atomic-expressions), separated by whitespace & wrapped by parenthesis. The first element is treated as a function, and the remaining elements are passed as inputs to that function. These are considered "symbolic expressions" because their behaviour depends on the symbols defined in the current scope. These modelled after lisp [s-expressions](https://en.wikipedia.org/wiki/S-expression), but are different in a few ways: - Ari doesn't have [cons cells](https://en.wikipedia.org/wiki/Cons), so symbolic expressions aren't implemented with cons cells. The closest concept to cons cells are [product expressions](#multiply-and-product-expressions). - Symbolic expressions form a lexical scope from the [label expressions](#label-expressions) it contains. This means that "let" expressions are embedded in all symbolic expressions. #### Assertion expressions ```lisp :equal (= a b c) ``` [Asserts]() that all arguments have the same number of possible states. If they do, this evaluates to the last argument, otherwise it evaluates to [`0`](#bottom-expression). #### Algebraic expressions ##### Additive expressions ###### Add and sum expressions ```lisp :add (+ a b) ``` Given natural inputs, produces a [sum type `∑`](https://en.wikipedia.org/wiki/Tagged_union), which is either `a` or `b`. This adds together the number of possible states between the two types. We derive a nary form of add `+` by repeated application of associativity: ```lisp :additive-associativity (= (+ (+ a b) c) (+ a (+ b c) (+ a b c) ) ``` ###### Additive inverse expressions ```lisp :subtract (- a b) ``` Produces an [integer type ``](https://en.wikipedia.org/wiki/Integer) that subtracts the first few `b` states from `a`. Combined with the [additive identity `0`](#bottom-expression), we derive a unary form of subtract `-`: ```lisp :negative (= (- 0 x) (- x) ) ``` ###### Additive group theory Additive expressions form an [abelian group](https://en.wikipedia.org/wiki/Abelian_group) in type-space, and a [non-abelian](https://en.wikipedia.org/wiki/Non-abelian_group) group in value-space. ##### Multiplicative expressions ###### Multiply and product expressions ```lisp :multiply (* a b) ``` Given natural inputs, produces a [product type `Π`](https://en.wikipedia.org/wiki/Product_type), which has both `a` and `b`. This multiplies the number of possible states between the two types. We derive a nary form of multiply `*` by repeated application of associativity: ```lisp :multiplicative-associativity (= (* (* a b) c) (* a (* b c)) (* a b c) ) ``` ###### Multiplicative inverse expressions ```lisp :divide (/ a b) ``` Produces a [rational type ``](https://en.wikipedia.org/wiki/Rational_number) type that divides `b` states out of `a`. Combined with the [multiplicative identity `1`](#unit-expression), we derive a unary form of divide `/`: ```lisp :inverse (= (/ 1 x) (/ x) ) ``` ###### Multiplicative group theory Multiplicative expressions form an [abelian group](https://en.wikipedia.org/wiki/Abelian_group) in type-space, and a [non-abelian](https://en.wikipedia.org/wiki/Non-abelian_group) group in value-space. ##### Exponentiation expressions ###### Exponentiate and map expressions ```lisp :exponentiate (^ b e) ``` Given natural inputs, produces a [map type `→`](https://en.wikipedia.org/wiki/Associative_array), which maps the exponent `e` to the base `b`. This raises the number of possible states in `b` to the power of the number of possible states in `e`. Exponentiation is non-associative, but we derive a nary form of exponentiate `^` through [currying](https://en.wikipedia.org/wiki/Currying). Which is just repeated application of the [power of a power identity](https://en.wikipedia.org/wiki/Exponentiation#Identities_and_properties): ```lisp :power-of-a-power-identity (= (^ (^ a b) c) (^ a (* b c)) (^ a b c) ) ``` ###### Exponentiation left inverse expression ```lisp :logarithm (log b x) ``` Produces a [complex type ](https://en.wikipedia.org/wiki/Complex_number), which treats `x ... ...

近期下载者

相关文件


收藏者