| author | Danilo M. <danix@danix.xyz> | 2026-05-11 20:23:52 +0200 |
|---|---|---|
| committer | Danilo M. <danix@danix.xyz> | 2026-05-11 20:23:52 +0200 |
| commit | 5f0710065f3696d83163909192208b3324439fbd (patch) | |
| tree | 5a46ab04a0b413434d357e0340ef7033eeea7f24 /CLAUDE.md | |
Initial commit: runpod-session.sh with README and CLAUDE.md

Bash script to manage RunPod Ollama pod lifecycle for opencode:
spin up / resume pod, wait for Ollama, patch opencode.jsonc baseURL,
warm up models into VRAM. Includes per-GPU confirmation prompt and
automatic fallback on SUPPLY_CONSTRAINT errors.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Diffstat (limited to 'CLAUDE.md')
| -rw-r--r-- | CLAUDE.md | 51 |
1 files changed, 51 insertions, 0 deletions
diff --git a/CLAUDE.md b/CLAUDE.md
new file mode 100644
index 0000000..4b1b242
--- /dev/null
+++ b/CLAUDE.md
@@ -0,0 +1,51 @@
+# CLAUDE.md
+
+This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+
+## What this is
+
+Single bash script (`runpod-session.sh`) that manages RunPod GPU pod lifecycle for running Ollama models, then patches `~/.config/opencode/opencode.jsonc` so opencode points at the live pod.
+
+## Running / testing
+
+```bash
+# Check syntax
+bash -n runpod-session.sh
+
+# Lint
+shellcheck runpod-session.sh
+
+# Run (requires RUNPOD_API_KEY in ~/.config/runpod-session/config)
+./runpod-session.sh --status
+./runpod-session.sh --model qwen3-coder:latest
+./runpod-session.sh --stop
+./runpod-session.sh --all-models
+./runpod-session.sh --new --gpu-type "RTX PRO 6000" --max-price 1.50
+```
+
+## Architecture
+
+All logic is in `main()` which runs these steps in order:
+
+1. **Existing pod check** — queries RunPod GraphQL API, matches pod by name containing "ollama"
+2. **Pod creation** — if none found: picks cheapest secure-cloud GPU under `MAX_PRICE_PER_HR`, creates pod with `ollama/ollama:latest` image, network volume mounted at `/workspace`
+3. **Wait for Ollama** — polls `https://<pod-id>-11434.proxy.runpod.net/api/tags` until `.models` appears
+4. **Patch opencode.json** — updates `baseURL`, active `model`, and merges `WARMUP_MODELS` into provider models block using jq `*` merge (existing config wins on conflicts)
+5. **Warmup** — POST to `/api/generate` with a dummy prompt to load model into VRAM at `WARMUP_NUM_CTX` context length
+6. **Save state** — writes `~/.config/runpod-session/state.json` with pod_id, url, model, timestamp
+
+## Key design constraints
+
+- All RunPod calls go through `gql()` — single curl wrapper that exits on API errors
+- Pod is identified by name matching `test("ollama"; "i")` — not by ID — so the name `ollama-session` set at creation must not change
+- `patch_opencode_config()` writes a `.bak` before touching opencode.jsonc; jq `*` merge means existing per-model settings survive
+- `OLLAMA_MODELS_PATH` env var on the pod is not set by the script — must be set in config if models live outside default location on the network volume
+- GPU selection only queries `secureCloud == true` pods; community cloud is excluded
+
+## Config and state files
+
+| Path | Purpose |
+|------|---------|
+| `~/.config/runpod-session/config` | Sourced as bash; holds API key, defaults |
+| `~/.config/runpod-session/state.json` | Last session record (pod_id, url, model, timestamp) |
+| `~/.config/opencode/opencode.jsonc` | Patched in-place; `.bak` written before changes |
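
Step 3 of the Architecture list added by this commit polls the pod's proxied `/api/tags` endpoint until `.models` appears. A minimal sketch of such a loop follows, under stated assumptions: the function names `ollama_tags_url`, `wait_for_ollama`, and `fetch_with_curl`, the `FETCH_CMD` hook, and the retry and delay defaults are all hypothetical and not taken from `runpod-session.sh`, and a plain substring check stands in for whatever jq test the script actually uses.

```bash
# Hypothetical sketch; none of these names come from runpod-session.sh.

# Build the proxied /api/tags URL from a pod ID (URL pattern as quoted in CLAUDE.md).
ollama_tags_url() {
  printf 'https://%s-11434.proxy.runpod.net/api/tags\n' "$1"
}

# Default fetcher: fail on HTTP errors, stay quiet, cap each request at 10 s.
fetch_with_curl() {
  curl -fsS --max-time 10 "$1"
}

# Poll until the response body mentions a "models" key.
# Args: pod ID, max attempts (assumed default 60), delay in seconds (assumed default 5).
# FETCH_CMD is an assumed hook naming the command used to fetch the URL.
wait_for_ollama() {
  local url body attempt max_tries delay
  url=$(ollama_tags_url "$1")
  max_tries="${2:-60}"
  delay="${3:-5}"
  attempt=1
  while [ "$attempt" -le "$max_tries" ]; do
    body=$("${FETCH_CMD:-fetch_with_curl}" "$url" 2>/dev/null) || body=""
    case "$body" in
      *'"models"'*) return 0 ;;   # substring check stands in for a jq test on .models
    esac
    if [ "$attempt" -lt "$max_tries" ]; then
      sleep "$delay"
    fi
    attempt=$((attempt + 1))
  done
  return 1
}
```

Called as `wait_for_ollama "$pod_id" 60 5`, this sketch would sleep for just under five minutes in total (plus request time) before returning non-zero; the real script may use different limits. Routing the request through `FETCH_CMD` is only there so the loop can be exercised with a stub instead of a live pod.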
