---
name: llm
description: LLM router that proxies to provider skills (claude, openai, ollama)
metadata:
  vibestack:
    version: 1.0.0
    main: false
---

# LLM Skill

Unified LLM router that proxies requests to provider-specific skills. Abstracts away which LLM backend is being used.

## Architecture

```
┌─────────────┐     ┌─────────────┐
│   client    │────▶│     llm     │ (router)
└─────────────┘     └──────┬──────┘
                           │
        ┌──────────────────┼──────────────────┐
        ▼                  ▼                  ▼
┌───────────────┐  ┌───────────────┐  ┌────────────────┐
│ claude skill  │  │ openai skill  │  │  ollama skill  │
│ localhost:8888│  │ localhost:8889│  │ localhost:11434│
└───────────────┘  └───────────────┘  └────────────────┘
```

## Configuration

### Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `LLM_PORT` | `8082` | Router port |
| `LLM_PROVIDER` | `claude` | Active provider: `claude`, `openai`, or `ollama` |
| `CLAUDE_URL` | `http://localhost:8888` | Claude skill URL |
| `OPENAI_URL` | `http://localhost:8889` | OpenAI skill URL |
| `OLLAMA_URL` | `http://localhost:11434` | Ollama URL |
| `MEMORY_URL` | (none) | Memory skill URL for conversation persistence |
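
As a sketch of how the router can resolve these variables to an upstream base URL (the `upstream_url` helper below is illustrative, not the actual implementation):

```python
import os

# Map each provider name to the env var holding its base URL, with the defaults above
PROVIDER_URLS = {
    "claude": os.environ.get("CLAUDE_URL", "http://localhost:8888"),
    "openai": os.environ.get("OPENAI_URL", "http://localhost:8889"),
    "ollama": os.environ.get("OLLAMA_URL", "http://localhost:11434"),
}

def upstream_url() -> str:
    """Return the base URL of the currently active provider skill."""
    provider = os.environ.get("LLM_PROVIDER", "claude")
    try:
        return PROVIDER_URLS[provider]
    except KeyError:
        raise ValueError(f"Unknown LLM_PROVIDER: {provider!r}") from None
```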

## API

### WebSocket Chat

Connect to `ws://localhost:8082/chat` for the unified chat interface.

Send message:

```json
{
  "type": "message",
  "content": "Hello!",
  "session_id": "optional-session-id"
}
```

Receive:

```json
{"type": "start", "session_id": "abc123"}
{"type": "token", "content": "Hello"}
{"type": "token", "content": "!"}
{"type": "end"}
```

### REST API

```bash
# Chat (proxied to provider)
curl http://localhost:8082/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "Hello!"}'

# Execute (one-shot, proxied to provider)
curl http://localhost:8082/execute \
  -H "Content-Type: application/json" \
  -d '{"prompt": "List all files"}'

# Health check
curl http://localhost:8082/health

# Get current provider
curl http://localhost:8082/provider
```
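
The same endpoints from Python, sketched with the `requests` library (assuming the responses are JSON bodies):

```python
import requests

BASE = "http://localhost:8082"

# Chat (proxied to the active provider)
reply = requests.post(f"{BASE}/chat", json={"message": "Hello!"}, timeout=60)
print(reply.json())

# Execute (one-shot)
result = requests.post(f"{BASE}/execute", json={"prompt": "List all files"}, timeout=60)
print(result.json())

# Which provider is currently active?
print(requests.get(f"{BASE}/provider", timeout=5).json())
```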

## Provider Skills

Each provider skill exposes its own API; the LLM router translates between the unified interface and each provider's format:

### Claude Skill (port 8888)

- `POST /chat` - `{"message": "...", "session_id": "..."}`
- `POST /execute` - `{"prompt": "..."}`

### OpenAI Skill (port 8889)

- `POST /v1/chat/completions` - OpenAI format

### Ollama (port 11434)

- `POST /api/chat` - Ollama format
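
To illustrate the translation step, a hedged sketch mapping a unified chat request onto each provider's payload; the unified fields come from the WebSocket protocol above, while the `model` values are placeholders, not configuration the router is documented to use:

```python
def to_provider_payload(
    provider: str, message: str, session_id: str | None = None
) -> tuple[str, dict]:
    """Translate a unified chat request into (path, body) for the given provider."""
    if provider == "claude":
        # Claude skill accepts the unified shape almost verbatim
        return "/chat", {"message": message, "session_id": session_id}
    if provider == "openai":
        # OpenAI chat completions expect a messages array
        return "/v1/chat/completions", {
            "model": "gpt-4o",  # placeholder model name
            "messages": [{"role": "user", "content": message}],
        }
    if provider == "ollama":
        # Ollama's chat endpoint also uses a messages array
        return "/api/chat", {
            "model": "llama3",  # placeholder model name
            "messages": [{"role": "user", "content": message}],
        }
    raise ValueError(f"Unknown provider: {provider!r}")
```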

## Switching Providers

```bash
# Use Claude (default)
LLM_PROVIDER=claude

# Use OpenAI
LLM_PROVIDER=openai

# Use Ollama
LLM_PROVIDER=ollama
```

Clients connect to `localhost:8082`; they don't need to know which provider is active.

## Tool Calling (Pass-through)

Tools are passed to the provider skill. When the LLM wants to call a tool:

1. LLM router sends tool definitions to provider
2. Provider returns tool call request
3. Router passes tool call to client via WebSocket
4. Client executes tool, sends result back
5. Router forwards result to provider
6. Provider continues conversation

```
// Client receives
{"type": "tool_call", "name": "read_file", "arguments": {"path": "/etc/hosts"}}

// Client sends back
{"type": "tool_result", "name": "read_file", "result": "127.0.0.1 localhost..."}
```

## Conversation Memory

If `MEMORY_URL` is set, conversations are stored:

```bash
MEMORY_URL=http://localhost:8081
```

Each conversation is saved to the memory skill for later retrieval.
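
The router's save step might look like the following sketch; the `/memories` endpoint path and payload fields are assumptions for illustration, not the memory skill's documented API:

```python
import os

import requests

def save_conversation(session_id: str, messages: list[dict]) -> None:
    """Persist a conversation to the memory skill, if one is configured."""
    memory_url = os.environ.get("MEMORY_URL")
    if not memory_url:
        return  # memory is optional; skip persistence when unset
    # NOTE: endpoint path and payload shape are assumptions for illustration
    requests.post(
        f"{memory_url}/memories",
        json={"session_id": session_id, "messages": messages},
        timeout=10,
    )
```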