| name | description |
|---|---|
| llm | LLM router that proxies to provider skills (claude, openai, ollama) |
# LLM Skill
Unified LLM router that proxies requests to provider-specific skills. Abstracts away which LLM backend is being used.
## Architecture

```
┌─────────────┐     ┌─────────────┐
│   client    │────▶│     llm     │  (router)
└─────────────┘     └──────┬──────┘
                           │
        ┌──────────────────┼──────────────────┐
        ▼                  ▼                  ▼
┌───────────────┐  ┌───────────────┐  ┌───────────────┐
│ claude skill  │  │ openai skill  │  │ ollama skill  │
│ localhost:8888│  │ localhost:8889│  │localhost:11434│
└───────────────┘  └───────────────┘  └───────────────┘
```
## Configuration

### Environment Variables
| Variable | Default | Description |
|---|---|---|
| LLM_PORT | 8082 | Router port |
| LLM_PROVIDER | claude | Active provider: claude, openai, ollama |
| CLAUDE_URL | http://localhost:8888 | Claude skill URL |
| OPENAI_URL | http://localhost:8889 | OpenAI skill URL |
| OLLAMA_URL | http://localhost:11434 | Ollama URL |
| MEMORY_URL | (none) | Memory skill URL for conversation persistence |
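For reference, here is a minimal sketch (not taken from the router source) of how these variables could be resolved, with defaults mirroring the table above:

```python
import os

# Defaults mirror the table above; override via environment variables.
PROVIDER_URLS = {
    "claude": os.environ.get("CLAUDE_URL", "http://localhost:8888"),
    "openai": os.environ.get("OPENAI_URL", "http://localhost:8889"),
    "ollama": os.environ.get("OLLAMA_URL", "http://localhost:11434"),
}

LLM_PORT = int(os.environ.get("LLM_PORT", "8082"))
LLM_PROVIDER = os.environ.get("LLM_PROVIDER", "claude")
MEMORY_URL = os.environ.get("MEMORY_URL")  # optional; persistence is skipped if unset

if LLM_PROVIDER not in PROVIDER_URLS:
    raise ValueError(f"Unknown LLM_PROVIDER: {LLM_PROVIDER}")

PROVIDER_URL = PROVIDER_URLS[LLM_PROVIDER]
```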
## API

### WebSocket Chat

Connect to `ws://localhost:8082/chat` for the unified chat interface.
Send message:

```json
{
  "type": "message",
  "content": "Hello!",
  "session_id": "optional-session-id"
}
```
Receive:

```json
{"type": "start", "session_id": "abc123"}
{"type": "token", "content": "Hello"}
{"type": "token", "content": "!"}
{"type": "end"}
```
### REST API

```bash
# Chat (proxied to provider)
curl http://localhost:8082/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "Hello!"}'

# Execute (one-shot, proxied to provider)
curl http://localhost:8082/execute \
  -H "Content-Type: application/json" \
  -d '{"prompt": "List all files"}'

# Health check
curl http://localhost:8082/health

# Get current provider
curl http://localhost:8082/provider
```
## Provider Skills

Each provider skill implements its own API. The LLM router translates:

### Claude Skill (port 8888)

- `POST /chat` - `{"message": "...", "session_id": "..."}`
- `POST /execute` - `{"prompt": "..."}`

### OpenAI Skill (port 8889)

- `POST /v1/chat/completions` - OpenAI format

### Ollama (port 11434)

- `POST /api/chat` - Ollama format
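To make the translation concrete, here is a rough sketch of how a unified chat request might be mapped to each provider's endpoint and payload. Only the endpoints come from the list above; the model names and the exact field mapping are illustrative assumptions:

```python
def to_provider_request(provider: str, message: str, session_id: str | None = None):
    """Translate a unified chat request into (path, json_body) for a provider skill."""
    if provider == "claude":
        # The Claude skill accepts the unified format directly.
        return "/chat", {"message": message, "session_id": session_id}
    if provider == "openai":
        # OpenAI-compatible chat completions format; the model name is an assumption.
        return "/v1/chat/completions", {
            "model": "gpt-4o-mini",
            "messages": [{"role": "user", "content": message}],
        }
    if provider == "ollama":
        # Ollama chat format; the model name is an assumption.
        return "/api/chat", {
            "model": "llama3",
            "messages": [{"role": "user", "content": message}],
            "stream": False,
        }
    raise ValueError(f"Unknown provider: {provider}")
```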
## Switching Providers

```bash
# Use Claude (default)
LLM_PROVIDER=claude

# Use OpenAI
LLM_PROVIDER=openai

# Use Ollama
LLM_PROVIDER=ollama
```
Clients connect to `localhost:8082`; they don't need to know which provider is active.
## Tool Calling (Pass-through)
Tools are passed to the provider skill. When the LLM wants to call a tool:
1. LLM router sends tool definitions to provider
2. Provider returns tool call request
3. Router passes tool call to client via WebSocket
4. Client executes tool, sends result back
5. Router forwards result to provider
6. Provider continues conversation
```
// Client receives
{"type": "tool_call", "name": "read_file", "arguments": {"path": "/etc/hosts"}}

// Client sends back
{"type": "tool_result", "name": "read_file", "result": "127.0.0.1 localhost..."}
```
## Conversation Memory

If MEMORY_URL is set, conversations are stored:

```bash
MEMORY_URL=http://localhost:8081
```
Each conversation is saved to the memory skill for later retrieval.
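The memory skill's API is not documented here; purely as an illustration, persistence could amount to one POST per conversation turn. The `/memories` path and payload shape below are hypothetical:

```python
import os

import requests  # pip install requests

MEMORY_URL = os.environ.get("MEMORY_URL")


def persist_turn(session_id: str, user_message: str, assistant_reply: str) -> None:
    """Store one conversation turn in the memory skill, if configured."""
    if not MEMORY_URL:
        return  # persistence is disabled when MEMORY_URL is unset
    requests.post(
        f"{MEMORY_URL}/memories",  # hypothetical endpoint, not from the memory skill docs
        json={
            "session_id": session_id,
            "messages": [
                {"role": "user", "content": user_message},
                {"role": "assistant", "content": assistant_reply},
            ],
        },
        timeout=5,
    )
```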