# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## What this is
A single bash script (`runpod-session.sh`) that manages the RunPod GPU pod lifecycle for running Ollama models, then patches `~/.config/opencode/opencode.jsonc`, `transart.py`, and `my-publisher/config.toml` so all tools point at the live pod.
## Running / testing
```bash
# Check syntax
bash -n runpod-session.sh
# Lint
shellcheck runpod-session.sh
# Run (requires RUNPOD_API_KEY in ~/.config/runpod-session/config)
./runpod-session.sh --status
./runpod-session.sh --model qwen3-coder:latest
./runpod-session.sh --stop
./runpod-session.sh --all-models
./runpod-session.sh --new --gpu-type "RTX PRO 6000" --max-price 1.50
```
## Architecture
All logic is in `main()`, which runs these steps in order:
1. **Existing pod check** — queries RunPod GraphQL API, matches pod by name containing "ollama"
2. **Pod creation** — if none found: picks cheapest secure-cloud GPU under `MAX_PRICE_PER_HR`, creates pod with `ollama/ollama:latest` image, network volume mounted at `/workspace`
3. **Wait for Ollama** — polls `https://<pod-id>-11434.proxy.runpod.net/api/tags` until `.models` appears
4. **Patch opencode.jsonc** — updates `baseURL`, the active `model`, and merges `WARMUP_MODELS` into the provider's models block using jq's `*` merge (existing config wins on conflicts)
5. **Patch external configs** — `patch_external_configs()` rewrites `OLLAMA_HOST` in `transart.py` and `ollama_host` in `my-publisher/config.toml` using `sed`; paths set via `TRANSART_SCRIPT` / `PUBLISHER_CONFIG` in config; skipped if empty or file missing
6. **Warmup** — POST to `/api/generate` with a dummy prompt to load model into VRAM at `WARMUP_NUM_CTX` context length
7. **Save state** — writes `~/.config/runpod-session/state.json` with pod_id, url, model, timestamp
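The jq `*` merge in step 4 can be sketched like this — the object shapes below are illustrative stand-ins, not the real opencode.jsonc layout:

```shell
# jq's '*' does a recursive object merge; the RIGHT operand wins on
# conflicting keys. Putting the existing config on the right is what
# makes per-model settings survive a warmup-models merge.
existing='{"models":{"qwen3-coder:latest":{"name":"Qwen3 Coder","options":{"num_ctx":32768}}}}'
warmup='{"models":{"qwen3-coder:latest":{"name":"placeholder"},"llama3:8b":{"name":"Llama 3 8B"}}}'

merged=$(jq -n --argjson w "$warmup" --argjson e "$existing" '$w * $e')
echo "$merged" | jq -r '.models["qwen3-coder:latest"].name'   # existing name wins
echo "$merged" | jq -r '.models["llama3:8b"].name'            # new model merged in
```

New models from `WARMUP_MODELS` are added, while any keys already present in the config (here `name` and `options`) are left untouched.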
## Key design constraints
- All RunPod calls go through `gql()` — single curl wrapper that exits on API errors
- Pod is identified by name matching `test("ollama"; "i")` — not by ID — so the name `ollama-session` set at creation must not change
- `patch_opencode_config()` writes a `.bak` before touching opencode.jsonc; jq `*` merge means existing per-model settings survive
- `patch_external_configs()` uses `sed -i` on the bare pod URL (no `/v1`); writes `.bak` before each file; skips silently when the var is empty, warns when the path is set but the file is missing
- `OLLAMA_MODELS_PATH` env var on the pod is not set by the script — must be set in config if models live outside default location on the network volume
- GPU selection only queries `secureCloud == true` pods; community cloud is excluded
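The name-based pod lookup in the second constraint can be sketched roughly as follows — the pods JSON here is a hypothetical stand-in for the real GraphQL response fetched via `gql()`:

```shell
# Hypothetical payload; the real one comes from the RunPod GraphQL API.
pods='{"data":{"myself":{"pods":[{"id":"abc123","name":"ollama-session"},{"id":"def456","name":"other-pod"}]}}}'

# Case-insensitive name match — the script keys off the name, not the ID,
# so renaming "ollama-session" at creation time would break this lookup.
pod_id=$(echo "$pods" | jq -r '.data.myself.pods[] | select(.name | test("ollama"; "i")) | .id')
echo "$pod_id"   # abc123
```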
## Config and state files
| Path | Purpose |
|------|---------|
| `~/.config/runpod-session/config` | Sourced as bash; holds API key, defaults |
| `~/.config/runpod-session/state.json` | Last session record (pod_id, url, model, timestamp) |
| `~/.config/opencode/opencode.jsonc` | Patched in-place; `.bak` written before changes |
| `$TRANSART_SCRIPT` | `OLLAMA_HOST` rewritten on each session start; `.bak` written first |
| `$PUBLISHER_CONFIG` | `ollama_host` rewritten on each session start; `.bak` written first |
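A `state.json` record written in step 7 might look like this — field names come from the step description, values (including the timestamp format) are illustrative:

```json
{
  "pod_id": "abc123",
  "url": "https://abc123-11434.proxy.runpod.net",
  "model": "qwen3-coder:latest",
  "timestamp": "2024-01-01T00:00:00Z"
}
```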