lmrouter

Category: Cloud-native tools
Development language: Go
File size: 0KB
Downloads: 0
Upload date: 2024-03-29 07:47:39
Uploader: sh-1993
Description: Experimental language model router and load balancer

File list:
agent/
hub/
message/
tests/
Dockerfile
go.mod
go.sum
main.go

# lmrouter

Just like [AI Horde](https://stablehorde.net/), but specifically for low-latency streaming text generation using ephemeral inference servers.

## Usage

```sh
# Build the project
go build .

# Run the server
./lmrouter server --listen :9090

# Run the agent
./lmrouter agent --hub ws://localhost:9090 --inference http://localhost:5000
```

## How it works

![diagram](.github/images/diagram.png)

lmrouter consists of two components: a server and an agent. The server acts as the hub and routes incoming inference requests to any available agent. Agents run on the inference server, close to where the inference API is hosted. Each agent makes an outbound websocket connection to the server, so there is no need to port forward agent nodes.

## Features

Implemented:

- `/v1/completions` endpoint
- `/v1/models` endpoint
- SSE streaming for completions endpoint
- Automatic selection of agent based on available models

To-do:

- `/v1/chat/completions` endpoint
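Since the endpoint names mirror the OpenAI-style completion API, a plain HTTP client should be enough to exercise a running hub. A minimal sketch is shown below; the model name and request fields are assumptions based on the endpoint names above, not something this README specifies:

```sh
# List the models advertised by connected agents (assumed to follow the usual /v1/models shape)
curl http://localhost:9090/v1/models

# Request a streamed completion; -N disables curl's output buffering so SSE chunks print as they arrive
curl -N http://localhost:9090/v1/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "example-model", "prompt": "Hello, world", "stream": true, "max_tokens": 32}'
```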
