Initial duckdb skill with HTTP API
120
SKILL.md
Normal file
@@ -0,0 +1,120 @@
---
name: duckdb
description: DuckDB embedded analytical database with HTTP API
metadata:
  version: "1.0.0"
  vibestack:
    main: false
---

# DuckDB Skill

[DuckDB](https://duckdb.org/) is a fast in-process analytical database; this skill exposes it through a simple HTTP API.

## Features

- Embedded OLAP database (no separate server process)
- Query CSV, Parquet, JSON files directly
- SQL interface via HTTP API
- Persistent storage option
- Auto-registers with Caddy if present

## Configuration

| Variable | Description | Default |
|----------|-------------|---------|
| `DUCKDB_PORT` | HTTP API port | `8432` |
| `DUCKDB_DATABASE` | Database file path | `:memory:` |
| `DUCKDB_DATA_DIR` | Directory for data files | `/data/duckdb` |
| `DUCKDB_DOMAIN` | Domain for Caddy auto-config | (none) |
| `DUCKDB_READ_ONLY` | Read-only mode | `false` |
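
A client script can mirror the defaults from the table. This is a minimal sketch, assuming the API is reached on `localhost` and serves `/query` as in the examples in this file. `DUCKDB_URL` is a hypothetical name introduced here for illustration, not a variable the skill itself reads.

```bash
# Fall back to the documented defaults when a variable is unset or empty.
: "${DUCKDB_PORT:=8432}"
: "${DUCKDB_DATABASE:=:memory:}"
: "${DUCKDB_DATA_DIR:=/data/duckdb}"
: "${DUCKDB_READ_ONLY:=false}"

# DUCKDB_URL is a hypothetical helper variable; the localhost host is an assumption.
DUCKDB_URL="http://localhost:${DUCKDB_PORT}/query"
echo "$DUCKDB_URL"
```

The `: "${VAR:=default}"` form assigns only when the variable is missing or empty, so values already exported in the environment take precedence.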

## HTTP API

### Execute Query

```bash
# Simple query
curl -X POST http://localhost:8432/query \
  -H "Content-Type: application/json" \
  -d '{"sql": "SELECT 1 + 1 AS result"}'

# Query a CSV file (SQL string literals use single quotes,
# so the JSON body is double-quoted for the shell)
curl -X POST http://localhost:8432/query \
  -H "Content-Type: application/json" \
  -d "{\"sql\": \"SELECT * FROM read_csv_auto('/data/duckdb/sales.csv') LIMIT 10\"}"

# Query a Parquet file
curl -X POST http://localhost:8432/query \
  -H "Content-Type: application/json" \
  -d "{\"sql\": \"SELECT * FROM read_parquet('/data/duckdb/events.parquet')\"}"
```
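
Nesting SQL inside JSON inside a shell string is easy to get wrong, so it can help to build the request body with a small helper. The sketch below is one possible approach, not part of the skill: `json_escape` and `duckdb_payload` are hypothetical names, and the escaping assumes the SQL contains no newlines, tabs, or other control characters.

```bash
# json_escape and duckdb_payload are hypothetical helper names.

# Escape backslashes and double quotes so the SQL text is a valid JSON string.
# Assumes the input has no newlines, tabs, or other control characters.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Wrap a SQL statement in the {"sql": "..."} body used by the curl examples.
duckdb_payload() {
  printf '{"sql": "%s"}' "$(json_escape "$1")"
}

duckdb_payload "SELECT * FROM read_csv_auto('/data/duckdb/sales.csv') LIMIT 10"
```

The result can be passed to curl as `-d "$(duckdb_payload "$sql")"`, which keeps the SQL itself in ordinary single-quoted SQL syntax.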

### Response Format

```json
{
  "success": true,
  "columns": ["result"],
  "rows": [[2]],
  "row_count": 1,
  "time_ms": 0.5
}
```
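
A script that calls the API will usually want to branch on the `success` field. The shape of a failed response is not documented here, so this sketch simply treats any response that lacks `"success": true` as a failure. `query_succeeded` is a hypothetical helper name, and the check is a plain text match rather than real JSON parsing.

```bash
# query_succeeded is a hypothetical helper name.
# Crude check: looks for "success": true in the response text, allowing
# optional whitespace around the colon. It does not parse the JSON.
query_succeeded() {
  printf '%s' "$1" | grep -Eq '"success"[[:space:]]*:[[:space:]]*true'
}

response='{"success": true, "columns": ["result"], "rows": [[2]], "row_count": 1, "time_ms": 0.5}'
if query_succeeded "$response"; then
  echo "query ok"
else
  echo "query failed" >&2
fi
```

For anything beyond a quick check, a JSON-aware tool such as `jq` is the more robust choice, since a text match can be fooled by the same characters appearing inside row data.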

## Use Cases

### Analytics on Log Data

```sql
-- Count log entries per level
-- (assumes newline-delimited JSON logs with a "level" field,
-- which read_json_auto exposes as a column)
SELECT
  level,
  count(*) AS count
FROM read_json_auto('/var/log/supervisor/*.log')
GROUP BY level;
```

### Query Remote Data

```sql
-- Query remote Parquet (S3, HTTP); requires the httpfs extension (see Extensions)
SELECT * FROM read_parquet('https://example.com/data.parquet');

-- Query remote CSV
SELECT * FROM read_csv_auto('https://example.com/data.csv');
```

### Create Persistent Tables

```sql
-- Create a table from a Parquet file.
-- It survives restarts only when DUCKDB_DATABASE points to a file,
-- not the default :memory:
CREATE TABLE events AS
SELECT * FROM read_parquet('/data/duckdb/events.parquet');

-- Count events per hour
SELECT date_trunc('hour', timestamp) AS hour, count(*) AS event_count
FROM events
GROUP BY 1 ORDER BY 1;
```
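
Because the default database is `:memory:`, it may be worth checking the configuration before relying on tables outliving the process. A minimal sketch, assuming the skill passes `DUCKDB_DATABASE` straight through to DuckDB; `is_persistent` is a hypothetical helper name.

```bash
# is_persistent is a hypothetical helper name.
# Succeeds when DUCKDB_DATABASE names a file; an unset or empty value
# falls back to the documented default, :memory:.
is_persistent() {
  [ "${DUCKDB_DATABASE:-:memory:}" != ":memory:" ]
}

if is_persistent; then
  echo "tables will be stored in ${DUCKDB_DATABASE}"
else
  echo "in-memory database: tables are lost on restart" >&2
fi
```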

## CLI Access

```bash
# Interactive shell
duckdb /data/duckdb/analytics.db

# One-off query
duckdb -c "SELECT * FROM 'data.csv' LIMIT 5"
```

## Extensions

DuckDB supports extensions for additional functionality:

```sql
-- Install and load the httpfs extension
INSTALL httpfs;
LOAD httpfs;

-- Now query S3/HTTP directly (private S3 buckets also need credentials configured)
SELECT * FROM read_parquet('s3://bucket/data.parquet');
```