---
name: llm
description: LLM router that proxies to provider skills (claude, openai, ollama)
metadata:
  version: "1.0.0"
  vibestack:
    main: false
---

# LLM Skill

Unified LLM router that proxies requests to provider-specific skills, abstracting away which LLM backend is in use.

## Architecture

```
┌─────────────┐      ┌─────────────┐
│   client    │─────▶│     llm     │  (router)
└─────────────┘      └──────┬──────┘
                            │
         ┌──────────────────┼──────────────────┐
         ▼                  ▼                  ▼
 ┌───────────────┐  ┌───────────────┐  ┌────────────────┐
 │ claude skill  │  │ openai skill  │  │  ollama skill  │
 │ localhost:8888│  │ localhost:8889│  │ localhost:11434│
 └───────────────┘  └───────────────┘  └────────────────┘
```

## Configuration

### Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `LLM_PORT` | `8082` | Router port |
| `LLM_PROVIDER` | `claude` | Active provider: `claude`, `openai`, `ollama` |
| `CLAUDE_URL` | `http://localhost:8888` | Claude skill URL |
| `OPENAI_URL` | `http://localhost:8889` | OpenAI skill URL |
| `OLLAMA_URL` | `http://localhost:11434` | Ollama URL |
| `MEMORY_URL` | (none) | Memory skill URL for conversation persistence |

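As a rough illustration, the router's startup might resolve these variables along the following lines (a sketch using the defaults from the table above; the actual implementation is not shown in this document):

```python
import os

# Sketch of configuration resolution, mirroring the table above.
PORT = int(os.environ.get("LLM_PORT", "8082"))
PROVIDER = os.environ.get("LLM_PROVIDER", "claude")
PROVIDER_URLS = {
    "claude": os.environ.get("CLAUDE_URL", "http://localhost:8888"),
    "openai": os.environ.get("OPENAI_URL", "http://localhost:8889"),
    "ollama": os.environ.get("OLLAMA_URL", "http://localhost:11434"),
}
# Optional: when unset, conversation persistence is disabled.
MEMORY_URL = os.environ.get("MEMORY_URL")
```
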
## API

### WebSocket Chat

Connect to `ws://localhost:8082/chat` for the unified chat interface.

**Send message:**

```json
{
  "type": "message",
  "content": "Hello!",
  "session_id": "optional-session-id"
}
```

**Receive:**

```json
{"type": "start", "session_id": "abc123"}
{"type": "token", "content": "Hello"}
{"type": "token", "content": "!"}
{"type": "end"}
```

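As an illustration, a minimal streaming client might look like this (a sketch; it assumes the third-party `websockets` Python package and only the message types shown above, with error handling omitted):

```python
import asyncio
import json

import websockets  # third-party: pip install websockets


async def chat(prompt: str) -> None:
    # Connect to the router; it proxies to whichever provider is active.
    async with websockets.connect("ws://localhost:8082/chat") as ws:
        await ws.send(json.dumps({"type": "message", "content": prompt}))
        while True:
            event = json.loads(await ws.recv())
            if event["type"] == "token":
                print(event["content"], end="", flush=True)
            elif event["type"] == "end":
                print()
                break


asyncio.run(chat("Hello!"))
```
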
### REST API

```bash
# Chat (proxied to provider)
curl http://localhost:8082/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "Hello!"}'

# Execute (one-shot, proxied to provider)
curl http://localhost:8082/execute \
  -H "Content-Type: application/json" \
  -d '{"prompt": "List all files"}'

# Health check
curl http://localhost:8082/health

# Get current provider
curl http://localhost:8082/provider
```

## Provider Skills

Each provider skill implements its own API; the LLM router translates between them (see the sketch after the list below):

### Claude Skill (port 8888)
- `POST /chat` - `{"message": "...", "session_id": "..."}`
- `POST /execute` - `{"prompt": "..."}`

### OpenAI Skill (port 8889)
- `POST /v1/chat/completions` - OpenAI format

### Ollama (port 11434)
- `POST /api/chat` - Ollama format

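As a rough illustration of that translation, mapping a router-level message into each provider's request body might look like this (a sketch; the OpenAI and Ollama payloads follow their public chat APIs, while the function names and model values are illustrative, not taken from this skill's code):

```python
def to_openai(message: str) -> dict:
    # OpenAI chat-completions format (model name is illustrative).
    return {
        "model": "gpt-4o-mini",
        "messages": [{"role": "user", "content": message}],
        "stream": True,
    }


def to_ollama(message: str) -> dict:
    # Ollama /api/chat format (model name is illustrative).
    return {
        "model": "llama3",
        "messages": [{"role": "user", "content": message}],
        "stream": True,
    }


def to_claude_skill(message: str, session_id: str | None = None) -> dict:
    # The Claude skill's own /chat format, per the section above.
    return {"message": message, "session_id": session_id}
```
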
## Switching Providers

```bash
# Use Claude (default)
LLM_PROVIDER=claude

# Use OpenAI
LLM_PROVIDER=openai

# Use Ollama
LLM_PROVIDER=ollama
```

Clients always connect to `localhost:8082`; they don't need to know which provider is active.

## Tool Calling (Pass-through)

Tools are passed through to the provider skill. When the LLM wants to call a tool:

1. The router sends the tool definitions to the provider
2. The provider returns a tool call request
3. The router passes the tool call to the client via WebSocket
4. The client executes the tool and sends the result back
5. The router forwards the result to the provider
6. The provider continues the conversation

```json
// Client receives
{"type": "tool_call", "name": "read_file", "arguments": {"path": "/etc/hosts"}}

// Client sends back
{"type": "tool_result", "name": "read_file", "result": "127.0.0.1 localhost..."}
```

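Client-side, this loop can be handled inside the same WebSocket session. A minimal sketch follows; the `run_tool` dispatcher is hypothetical, and only the `tool_call`/`tool_result` message types come from the protocol above:

```python
import asyncio
import json

import websockets  # third-party: pip install websockets


def run_tool(name: str, arguments: dict) -> str:
    # Hypothetical local dispatcher; execute the requested tool here.
    if name == "read_file":
        with open(arguments["path"]) as f:
            return f.read()
    raise ValueError(f"unknown tool: {name}")


async def chat_with_tools(prompt: str) -> None:
    async with websockets.connect("ws://localhost:8082/chat") as ws:
        await ws.send(json.dumps({"type": "message", "content": prompt}))
        while True:
            event = json.loads(await ws.recv())
            if event["type"] == "tool_call":
                result = run_tool(event["name"], event["arguments"])
                await ws.send(json.dumps(
                    {"type": "tool_result", "name": event["name"], "result": result}
                ))
            elif event["type"] == "token":
                print(event["content"], end="", flush=True)
            elif event["type"] == "end":
                print()
                break


asyncio.run(chat_with_tools("What is in /etc/hosts?"))
```
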
## Conversation Memory

If `MEMORY_URL` is set, conversations are stored:

```bash
MEMORY_URL=http://localhost:8081
```

Each conversation is saved to the memory skill for later retrieval.

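The persistence call itself is internal to the router, and the memory skill's API is not documented in this file. Purely as a hypothetical illustration of what gets saved (endpoint and payload are invented for this sketch):

```python
import requests  # third-party: pip install requests

# Hypothetical persistence call; the memory skill's real endpoint and
# payload schema are not documented here.
requests.post(
    "http://localhost:8081/store",
    json={
        "session_id": "abc123",
        "messages": [
            {"role": "user", "content": "Hello!"},
            {"role": "assistant", "content": "Hello! How can I help?"},
        ],
    },
)
```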