go-gpt-3-encoder

Category: GPT/ChatGPT
Development tool: Go
File size: 560KB
Downloads: 0
Upload date: 2023-03-12 18:53:58
Uploader: sh-1993
Description: Go BPE tokenizer (Encoder+Decoder) for GPT2 and GPT3

File list:
LICENSE (1070, 2023-03-13)
Makefile (1044, 2023-03-13)
encoder.go (7871, 2023-03-13)
encoder.json (1042301, 2023-03-13)
encoder_test.go (10162, 2023-03-13)
go.mod (377, 2023-03-13)
go.sum (2057, 2023-03-13)
utils.go (317, 2023-03-13)
vocab.bpe (456318, 2023-03-13)

# go-gpt-3-encoder

Go BPE tokenizer (Encoder+Decoder) for GPT2 and GPT3.

## About

GPT2 and GPT3 use byte pair encoding to turn text into a series of integers to feed into the model. This is a Go implementation of OpenAI's original Python encoder/decoder, which can be found [here](https://github.com/openai/gpt-2/blob/master/src/encoder.py).

This code was inspired by a [JavaScript implementation](https://github.com/latitudegames/GPT-3-Encoder) and partially generated by OpenAI itself!

## Install

```bash
go get github.com/samber/go-gpt-3-encoder
```

## Usage

```go
package main

import (
	"fmt"
	"log"

	tokenizer "github.com/samber/go-gpt-3-encoder"
)

func main() {
	// Build the encoder.
	encoder, err := tokenizer.NewEncoder()
	if err != nil {
		log.Fatal(err)
	}

	str := "This is an example sentence to try encoding out on!"

	// Turn the text into a slice of integer tokens.
	encoded, err := encoder.Encode(str)
	if err != nil {
		log.Fatal(err)
	}

	fmt.Println("We can look at each token and what it represents:")
	for _, token := range encoded {
		fmt.Printf("%d -- %s\n", token, encoder.Decode([]int{token}))
	}

	// Decoding the full token slice restores the text.
	decoded := encoder.Decode(encoded)
	fmt.Printf("We can decode it back into: %s\n", decoded)
}
```

## Contribute

Some corner cases are not covered by this library. See `@TODO` in tests.
