| name | description |
|---|---|
| llm | LLM router that proxies to provider skills (claude, openai, ollama) |
# LLM Skill
Unified LLM router that proxies requests to provider-specific skills. Abstracts away which LLM backend is being used.
## Architecture

```
┌─────────────┐     ┌─────────────┐
│   client    │────▶│     llm     │  (router)
└─────────────┘     └──────┬──────┘
                           │
        ┌──────────────────┼──────────────────┐
        ▼                  ▼                  ▼
┌───────────────┐  ┌───────────────┐  ┌───────────────┐
│ claude skill  │  │ openai skill  │  │ ollama skill  │
│ localhost:8888│  │ localhost:8889│  │localhost:11434│
└───────────────┘  └───────────────┘  └───────────────┘
```
## Configuration

### Environment Variables
| Variable | Default | Description |
|---|---|---|
| LLM_PORT | 8082 | Router port |
| LLM_PROVIDER | claude | Active provider: claude, openai, ollama |
| CLAUDE_URL | http://localhost:8888 | Claude skill URL |
| OPENAI_URL | http://localhost:8889 | OpenAI skill URL |
| OLLAMA_URL | http://localhost:11434 | Ollama URL |
| MEMORY_URL | (none) | Memory skill URL for conversation persistence |
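For reference, here is a minimal sketch (not taken from the router source) of how these variables could be resolved, with defaults mirroring the table above:

```python
import os

# Defaults mirror the table above; override via environment variables.
PROVIDER_URLS = {
    "claude": os.environ.get("CLAUDE_URL", "http://localhost:8888"),
    "openai": os.environ.get("OPENAI_URL", "http://localhost:8889"),
    "ollama": os.environ.get("OLLAMA_URL", "http://localhost:11434"),
}

LLM_PORT = int(os.environ.get("LLM_PORT", "8082"))
LLM_PROVIDER = os.environ.get("LLM_PROVIDER", "claude")
MEMORY_URL = os.environ.get("MEMORY_URL")  # optional; persistence is skipped if unset

if LLM_PROVIDER not in PROVIDER_URLS:
    raise ValueError(f"Unknown LLM_PROVIDER: {LLM_PROVIDER}")

PROVIDER_URL = PROVIDER_URLS[LLM_PROVIDER]
```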
## API

### WebSocket Chat

Connect to `ws://localhost:8082/chat` for the unified chat interface.
Send message:

```json
{
  "type": "message",
  "content": "Hello!",
  "session_id": "optional-session-id"
}
```
Receive:

```json
{"type": "start", "session_id": "abc123"}
{"type": "token", "content": "Hello"}
{"type": "token", "content": "!"}
{"type": "end"}
```
### REST API

```bash
# Chat (proxied to provider)
curl http://localhost:8082/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "Hello!"}'

# Execute (one-shot, proxied to provider)
curl http://localhost:8082/execute \
  -H "Content-Type: application/json" \
  -d '{"prompt": "List all files"}'

# Health check
curl http://localhost:8082/health

# Get current provider
curl http://localhost:8082/provider
```
## Provider Skills

Each provider skill implements its own API. The LLM router translates:

### Claude Skill (port 8888)

- `POST /chat` - `{"message": "...", "session_id": "..."}`
- `POST /execute` - `{"prompt": "..."}`

### OpenAI Skill (port 8889)

- `POST /v1/chat/completions` - OpenAI format

### Ollama (port 11434)

- `POST /api/chat` - Ollama format
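To make the translation concrete, here is a rough sketch of how a unified chat request might be mapped to each provider's endpoint and payload. Only the endpoints come from the list above; the model names and the exact field mapping are illustrative assumptions:

```python
def to_provider_request(provider: str, message: str, session_id: str | None = None):
    """Translate a unified chat request into (path, json_body) for a provider skill."""
    if provider == "claude":
        # The Claude skill accepts the unified format directly.
        return "/chat", {"message": message, "session_id": session_id}
    if provider == "openai":
        # OpenAI-compatible chat completions format; the model name is an assumption.
        return "/v1/chat/completions", {
            "model": "gpt-4o-mini",
            "messages": [{"role": "user", "content": message}],
        }
    if provider == "ollama":
        # Ollama chat format; the model name is an assumption.
        return "/api/chat", {
            "model": "llama3",
            "messages": [{"role": "user", "content": message}],
            "stream": False,
        }
    raise ValueError(f"Unknown provider: {provider}")
```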
## Switching Providers

```bash
# Use Claude (default)
LLM_PROVIDER=claude

# Use OpenAI
LLM_PROVIDER=openai

# Use Ollama
LLM_PROVIDER=ollama
```
Clients connect to `localhost:8082`; they don't need to know which provider is active.
## Tool Calling (Pass-through)
Tools are passed to the provider skill. When the LLM wants to call a tool:
1. LLM router sends tool definitions to provider
2. Provider returns tool call request
3. Router passes tool call to client via WebSocket
4. Client executes tool, sends result back
5. Router forwards result to provider
6. Provider continues conversation
```
// Client receives
{"type": "tool_call", "name": "read_file", "arguments": {"path": "/etc/hosts"}}

// Client sends back
{"type": "tool_result", "name": "read_file", "result": "127.0.0.1 localhost..."}
```
## Conversation Memory

If MEMORY_URL is set, conversations are stored:

```bash
MEMORY_URL=http://localhost:8081
```
Each conversation is saved to the memory skill for later retrieval.
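The memory skill's API is not documented here; purely as an illustration, persistence could amount to one POST per conversation turn. The `/memories` path and payload shape below are hypothetical:

```python
import os

import requests  # pip install requests

MEMORY_URL = os.environ.get("MEMORY_URL")


def persist_turn(session_id: str, user_message: str, assistant_reply: str) -> None:
    """Store one conversation turn in the memory skill, if configured."""
    if not MEMORY_URL:
        return  # persistence is disabled when MEMORY_URL is unset
    requests.post(
        f"{MEMORY_URL}/memories",  # hypothetical endpoint, not from the memory skill docs
        json={
            "session_id": session_id,
            "messages": [
                {"role": "user", "content": user_message},
                {"role": "assistant", "content": assistant_reply},
            ],
        },
        timeout=5,
    )
```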