---
name: duckdb
description: DuckDB embedded analytical database with HTTP API
metadata:
  version: "1.0.0"
  vibestack:
    main: false
---

# DuckDB Skill

[DuckDB](https://duckdb.org/) - fast in-process analytical database with a simple HTTP API.

## Features

- Embedded OLAP database (no separate server process)
- Query CSV, Parquet, and JSON files directly
- SQL interface via HTTP API
- Persistent storage option
- Auto-registers with Caddy if present

## Configuration

| Variable | Description | Default |
|----------|-------------|---------|
| `DUCKDB_PORT` | HTTP API port | `8432` |
| `DUCKDB_DATABASE` | Database file path | `:memory:` |
| `DUCKDB_DATA_DIR` | Directory for data files | `/data/duckdb` |
| `DUCKDB_DOMAIN` | Domain for Caddy auto-config | (none) |
| `DUCKDB_READ_ONLY` | Read-only mode | `false` |

## HTTP API

### Execute Query

```bash
# Simple query
curl -X POST http://localhost:8432/query \
  -H "Content-Type: application/json" \
  -d '{"sql": "SELECT 1 + 1 AS result"}'

# Query CSV file
# (DuckDB string literals take single quotes; double quotes denote identifiers,
#  so the JSON body is wrapped in shell double quotes here)
curl -X POST http://localhost:8432/query \
  -H "Content-Type: application/json" \
  -d "{\"sql\": \"SELECT * FROM read_csv_auto('/data/duckdb/sales.csv') LIMIT 10\"}"

# Query Parquet file
curl -X POST http://localhost:8432/query \
  -H "Content-Type: application/json" \
  -d "{\"sql\": \"SELECT * FROM read_parquet('/data/duckdb/events.parquet')\"}"
```

### Response Format

```json
{
  "success": true,
  "columns": ["result"],
  "rows": [[2]],
  "row_count": 1,
  "time_ms": 0.5
}
```

## Use Cases

### Analytics on Log Data

```sql
-- Query JSON logs
SELECT
  json_extract_string(line, '$.level') as level,
  count(*) as count
FROM read_json_auto('/var/log/supervisor/*.log')
GROUP BY level;
```

### Query Remote Data

```sql
-- Query remote Parquet (S3, HTTP)
SELECT * FROM read_parquet('https://example.com/data.parquet');

-- Query remote CSV
SELECT * FROM read_csv_auto('https://example.com/data.csv');
```

### Create Persistent Tables

```sql
-- Create table
CREATE TABLE events AS
SELECT * FROM read_parquet('/data/duckdb/events.parquet');

-- Query table
SELECT
  date_trunc('hour', timestamp) as hour,
  count(*)
FROM events
GROUP BY 1
ORDER BY 1;
```

## CLI Access

```bash
# Interactive shell
duckdb /data/duckdb/analytics.db

# One-off query
duckdb -c "SELECT * FROM 'data.csv' LIMIT 5"
```

## Extensions

DuckDB supports extensions for additional functionality:

```sql
-- Install and load extensions
INSTALL httpfs;
LOAD httpfs;

-- Now query S3/HTTP directly
SELECT * FROM read_parquet('s3://bucket/data.parquet');
```
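
Returning to the HTTP API: hand-writing the JSON body for each `curl` call gets awkward once the SQL itself contains quotes. A minimal sketch of a wrapper follows. The function names `duckdb_payload` and `duckdb_query` are hypothetical and not part of this skill; the sketch assumes only the `/query` endpoint, the `{"sql": ...}` body, and the `DUCKDB_PORT` default documented above. It escapes backslashes and double quotes and flattens newlines to spaces, so it is not a full JSON encoder (a `--` comment inside multi-line SQL would swallow the rest of the statement).

```shell
# Hypothetical helper: wrap a SQL string in the {"sql": "..."} body that the
# /query endpoint above expects. Newlines become spaces, then backslashes and
# double quotes are escaped so the result stays valid JSON.
duckdb_payload() {
  printf '{"sql": "%s"}' \
    "$(printf '%s' "$1" | tr '\n' ' ' | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')"
}

# Hypothetical helper: POST the payload to the local HTTP API.
# Assumes the service listens on localhost at DUCKDB_PORT (default 8432).
duckdb_query() {
  curl -sS -X POST "http://localhost:${DUCKDB_PORT:-8432}/query" \
    -H "Content-Type: application/json" \
    -d "$(duckdb_payload "$1")"
}
```

With these defined, a call such as `duckdb_query "SELECT * FROM read_csv_auto('/data/duckdb/sales.csv') LIMIT 10"` sends the same request as the CSV example in the HTTP API section, without manual escaping.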