# mkwheels Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** A standalone bash CLI that produces a reproducible, pinned Python wheels tarball (`<pkg>-wheels-<ver>.tar.gz`) plus a hashed `requirements.txt` for vendoring into SlackBuilds.

**Architecture:** Single bash script drives a throwaway `python3 -m venv` + `pip download` to resolve a package's full dependency tree into wheels, emits a hashed lockfile, and packs the wheels into a byte-reproducible tarball (normalized tar metadata + gzip `-n`, mtime pinned to a SOURCE_DATE_EPOCH that defaults to the PyPI release upload time). A selftest builds `six` twice and asserts the tarballs are byte-identical.

**Tech Stack:** bash, python3 + pip, jq, curl, tar/gzip/md5sum. GPLv2 (v2-only).

---

## File Structure

```
~/Programming/GIT/mkwheels/
├── mkwheels          # the script (single-file bash, executable)
├── selftest          # reproducibility check (bash)
├── LICENSE           # GPLv2 full text
├── README.md         # usage, reproducibility rationale, SBo integration
└── .gitignore        # ignore scratch output (*.tar.gz, requirements.txt at root)
```

Responsibilities:
- `mkwheels` — the whole CLI: arg parse, epoch resolution, venv+download, lockfile, reproducible tar.
- `selftest` — runs `mkwheels six <ver>` twice, asserts md5 of the two tarballs match.
- `LICENSE` / `README.md` — licensing + docs per global preference.

The script is small enough to stay one file; the selftest is separated so the
tool itself carries no test scaffolding.

---

### Task 1: Repo scaffolding (LICENSE, .gitignore, README skeleton)

**Files:**
- Create: `LICENSE`
- Create: `.gitignore`
- Create: `README.md`

- [ ] **Step 1: Add the GPLv2 LICENSE**

Copy the official GPLv2 text into `LICENSE`. Fetch it (already cached at
`/tmp/.../scratchpad/gpl-2.0.txt` during planning; re-fetch if absent):

```bash
curl -fsSL https://www.gnu.org/licenses/old-licenses/gpl-2.0.txt -o LICENSE
head -2 LICENSE   # "GNU GENERAL PUBLIC LICENSE / Version 2, June 1991"
```

- [ ] **Step 2: Add .gitignore**

```
# scratch output from running mkwheels in the repo root
/*.tar.gz
/requirements.txt
```

- [ ] **Step 3: Add README skeleton**

`README.md`:

````markdown
# mkwheels

Build a reproducible, pinned Python wheels tarball for vendoring into a
SlackBuild (or any offline `pip install`). Generic over package + version.

## Usage

```
mkwheels <pkg> <ver> [epoch]
```

- `<pkg> <ver>` — the PyPI package and exact version to vendor.
- `[epoch]` — optional `SOURCE_DATE_EPOCH`. Omitted → auto-derived from the
  PyPI release upload time (a warning is printed). Pass it to override.
- `OUTPUT` env var overrides the output directory (default: current dir).

Outputs `<pkg>-wheels-<ver>.tar.gz` and `requirements.txt` (pinned + hashed).
Prints the md5sum and the resolved epoch.

## Requirements

`bash`, `python3` + `pip`, `jq`, `curl`, `tar`, `gzip`, `md5sum`.

## Reproducibility

PyPI release files are immutable, so a given wheel never changes once
published. The tarball normalizes tar metadata (sorted entries, fixed
mtime/owner, `gzip -n`) so it is byte-identical for the same wheel set + epoch.

Only the top-level package is pinned on the command line. Its dependencies
(including any whose upstream pins a git URL) are frozen at download time:
`pip download` resolves whatever is current for the host's Python version and
platform, and the emitted `requirements.txt` records the exact resolved
versions. Once built, the tarball is the source of truth.

## SBo integration

Run `mkwheels <pkg> <ver>`, upload the tarball to your package host, and set
`DOWNLOAD_x86_64` / `MD5SUM_x86_64` in the SlackBuild `.info` to point at it.
The SlackBuild then `pip install --no-index --find-links=<wheels>` into a venv.

## License

GPLv2 (v2-only). See `LICENSE`. Copyright (C) 2026 Danilo M. <danix@danix.xyz>.
````
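
The SBo integration paragraph in the README skeleton comes down to recording the tarball's URL and md5 in the `.info` file. A minimal sketch of generating those two lines, assuming a hypothetical helper name `info_lines` and a placeholder host URL (neither is part of mkwheels):

```bash
# Sketch only: info_lines and the example.org URL are illustrative,
# not part of mkwheels.
info_lines() {
    url=$1        # where the tarball will be hosted
    tarball=$2    # local path to the finished wheels tarball
    md5=$(md5sum "$tarball" | cut -d' ' -f1)
    printf 'DOWNLOAD_x86_64="%s"\n' "$url"
    printf 'MD5SUM_x86_64="%s"\n' "$md5"
}

# Usage (hypothetical host):
# info_lines "https://example.org/six-wheels-1.16.0.tar.gz" six-wheels-1.16.0.tar.gz
```

The output can be pasted into the `.info` file by hand; the md5 is the same value mkwheels prints at the end of a run.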

- [ ] **Step 4: Commit**

```bash
git add LICENSE .gitignore README.md
git commit -S -m "mkwheels: add LICENSE, gitignore, README skeleton"
```

---

### Task 2: Script skeleton — header, usage, arg parse, tool checks

**Files:**
- Create: `mkwheels`

- [ ] **Step 1: Write the script skeleton**

Create `mkwheels`, `chmod +x` it after. Content:

```bash
#!/bin/bash
# mkwheels — build a reproducible, pinned Python wheels tarball for a package.
#
# Copyright (C) 2026 Danilo M. <danix@danix.xyz>
#
# This program is free software; you can redistribute it and/or modify it
# under the terms of the GNU General Public License version 2 as published by
# the Free Software Foundation.
#
# This program is distributed in the hope that it will be useful, but WITHOUT
# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
# FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License along with
# this program; if not, see <https://www.gnu.org/licenses/>.
set -eu

usage() {
    cat <<EOF
usage: ${0##*/} <pkg> <ver> [epoch]

Build a reproducible pinned Python wheels tarball <pkg>-wheels-<ver>.tar.gz
plus a hashed requirements.txt, for vendoring into a SlackBuild.

  <pkg> <ver>  PyPI package name and exact version to vendor.
  [epoch]      SOURCE_DATE_EPOCH for the tarball mtime. Omitted -> auto-derived
               from the PyPI release upload time (a warning is printed).

  OUTPUT       env var: output directory (default: current dir).

Requires: python3+pip, jq, curl, tar, gzip, md5sum.
EOF
}

case "${1:-}" in
    -h|--help) usage; exit 0 ;;
esac
[ $# -ge 2 ] && [ $# -le 3 ] || { usage >&2; exit 2; }

pkg=$1
ver=$2
epoch=${3:-}
OUTPUT=${OUTPUT:-$PWD}

# Check required tools up front.
for tool in python3 jq curl tar gzip md5sum; do
    command -v "$tool" >/dev/null 2>&1 || {
        echo "error: required tool not found: $tool" >&2
        exit 1
    }
done
python3 -m pip --version >/dev/null 2>&1 || {
    echo "error: python3 pip module not available" >&2
    exit 1
}

echo "mkwheels: $pkg $ver -> $OUTPUT/$pkg-wheels-$ver.tar.gz"
```

- [ ] **Step 2: Verify usage and arg validation**

```bash
chmod +x mkwheels
./mkwheels -h            # prints usage, exit 0
./mkwheels; echo $?      # usage to stderr, exit 2
./mkwheels onlyone; echo $?   # exit 2
```
Expected: `-h` prints usage; no-arg and one-arg print usage and exit 2.

- [ ] **Step 3: Commit**

```bash
git add mkwheels
git commit -S -m "mkwheels: add script skeleton with arg parse and tool checks"
```

---

### Task 3: Epoch resolution from PyPI

**Files:**
- Modify: `mkwheels` (append epoch-resolution block after the tool checks)

- [ ] **Step 1: Add the epoch resolution block**

Insert after the `echo "mkwheels: ..."` line:

```bash
# Resolve SOURCE_DATE_EPOCH. Explicit arg wins; otherwise derive it from the
# earliest file upload time of this version on PyPI (a real, reproducible,
# release-tied timestamp).
if [ -z "$epoch" ]; then
    meta=$(curl -fsSL "https://pypi.org/pypi/$pkg/$ver/json") || {
        echo "error: cannot fetch PyPI metadata for $pkg $ver" >&2
        exit 1
    }
    iso=$(printf '%s' "$meta" \
        | jq -r '[.urls[].upload_time_iso_8601] | sort | .[0] // empty')
    [ -n "$iso" ] || {
        echo "error: no upload time found for $pkg $ver on PyPI" >&2
        exit 1
    }
    epoch=$(date -u -d "$iso" +%s)
    echo "warning: epoch not given; using PyPI upload time $iso (epoch $epoch)" >&2
fi
export SOURCE_DATE_EPOCH="$epoch"
```
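
The two moving parts of the block above are picking the earliest upload timestamp and converting it to seconds since the Unix epoch. A minimal offline sketch, assuming GNU `date` and timestamps that share one fixed-width UTC format (so byte-wise order equals chronological order); the helper names and sample timestamps are hypothetical:

```bash
# Sketch only: helper names and sample timestamps are illustrative.

# Print the earliest of the ISO 8601 UTC timestamps given as arguments.
# Same-format timestamps sort chronologically under a plain byte-wise sort.
earliest_iso() {
    printf '%s\n' "$@" | LC_ALL=C sort | head -n 1
}

# Convert an ISO 8601 UTC timestamp to seconds since the epoch (GNU date).
iso_to_epoch() {
    date -u -d "$1" +%s
}

first=$(earliest_iso "2021-05-05T00:00:00Z" "2021-05-03T00:00:00Z")
echo "$first -> $(iso_to_epoch "$first")"   # 2021-05-03T00:00:00Z -> 1620000000
```

This also shows where the fixed test epoch `1620000000` used in later tasks falls: midnight UTC on 2021-05-03.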

- [ ] **Step 2: Verify epoch derivation against a known release**

```bash
# six 1.16.0 was uploaded 2021-05-05; check we get a stable epoch and warning.
./mkwheels six 1.16.0 2>&1 | grep -i "using PyPI upload time"
```
Expected: a warning line naming a 2021 ISO timestamp and an epoch integer.
(At this stage the script ends right after the epoch block, so it exits once
the warning is printed; the download step arrives in Task 4.)

- [ ] **Step 3: Verify explicit epoch suppresses the warning**

```bash
./mkwheels six 1.16.0 1620000000 2>&1 | grep -i "using PyPI" && echo UNEXPECTED || echo "ok: no auto-derive"
```
Expected: `ok: no auto-derive`.

- [ ] **Step 4: Commit**

```bash
git add mkwheels
git commit -S -m "mkwheels: resolve SOURCE_DATE_EPOCH from PyPI upload time"
```

---

### Task 4: Download wheels + emit hashed requirements.txt

**Files:**
- Modify: `mkwheels` (append download + lockfile block)

- [ ] **Step 1: Add the temp workdir, venv, download, and lockfile block**

Insert after the epoch block:

```bash
# Throwaway workdir, cleaned on exit.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

wheels="$work/wheels"
mkdir -p "$wheels"

# Isolated build env so host-installed packages don't leak in. Note that a
# venv does not ignore pip config files or PIP_* env vars.
python3 -m venv "$work/venv"
"$work/venv/bin/pip" install --quiet --upgrade pip wheel >/dev/null

# Resolve the full tree into $wheels. pip download keeps sdists as sdists
# when no wheel is published; it does not convert them to wheels.
"$work/venv/bin/pip" download "$pkg==$ver" --dest "$wheels"

# Emit a pinned, hashed requirements.txt from the downloaded files. Each
# distribution is pinned to its version with a sha256 hash per file.
req="$work/requirements.txt"
: > "$req"
for f in "$wheels"/*; do
    base=$(basename "$f")
    # name-version from the wheel/sdist filename: split on first two '-' fields
    # wheels: name-version-...; sdists: name-version.tar.gz
    name=${base%%-*}
    rest=${base#*-}
    version=${rest%%-*}
    version=${version%.tar.gz}
    hash=$(python3 -c "import hashlib,sys;print(hashlib.sha256(open(sys.argv[1],'rb').read()).hexdigest())" "$f")
    printf '%s==%s --hash=sha256:%s\n' "$name" "$version" "$hash" >> "$req"
done
sort -o "$req" "$req"
```
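
Each lockfile entry pairs a `name==version` pin with the sha256 of one downloaded file. A minimal sketch of producing one such line, assuming coreutils `sha256sum` is available (the script above hashes with `python3` instead, which avoids the extra dependency); `lock_line` is a hypothetical helper name:

```bash
# Sketch only: lock_line is illustrative; mkwheels itself hashes via python3.
lock_line() {
    name=$1
    version=$2
    file=$3
    # sha256sum prints "<hex>  <path>"; keep only the hex digest.
    hash=$(sha256sum "$file" | cut -d' ' -f1)
    printf '%s==%s --hash=sha256:%s\n' "$name" "$version" "$hash"
}

# Usage: lock_line six 1.16.0 wheels/six-1.16.0-py2.py3-none-any.whl
```

A lockfile in this format can be checked at install time with `pip install --require-hashes --no-index --find-links=wheels -r requirements.txt`.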

- [ ] **Step 2: Verify a small download produces wheels + a hashed lockfile**

```bash
./mkwheels six 1.16.0 1620000000 2>/dev/null
# No tarball yet (that is Task 5); this only checks the run exits cleanly.
echo $?
```
Expected: exit 0 (download + lockfile build succeed; six is pure-python with
no deps, so exactly one entry would be produced internally).

- [ ] **Step 3: Commit**

```bash
git add mkwheels
git commit -S -m "mkwheels: download wheels and emit hashed requirements.txt"
```

---

### Task 5: Reproducible tar + final output

**Files:**
- Modify: `mkwheels` (append tar/output block)

- [ ] **Step 1: Add the reproducible tar and output block**

Insert after the lockfile block:

```bash
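mkdir -p "$OUTPUT"
# Make OUTPUT absolute: the tar pipeline below runs after a cd into $work.
OUTPUT=$(cd "$OUTPUT" && pwd)
tarball="$OUTPUT/$pkg-wheels-$ver.tar.gz"

# Reproducible archive: sorted entries, normalized ownership/mtime, gzip -n.
# Run from $work so the archive holds a top-level 'wheels/' dir.
# GNU tar treats --no-recursion as position-sensitive, so it goes before
# --files-from; otherwise 'wheels' may be recursed and its files archived twice.
( cd "$work" \
  && find wheels -print0 | LC_ALL=C sort -z \
     | tar --no-recursion --null --files-from=- \
           --mtime="@$SOURCE_DATE_EPOCH" \
           --owner=0 --group=0 --numeric-owner \
           -cf - \
     | gzip -n > "$tarball" )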

cp "$work/requirements.txt" "$OUTPUT/requirements.txt"

md5=$(md5sum "$tarball" | cut -d' ' -f1)
echo "wheels tarball: $tarball"
echo "requirements:   $OUTPUT/requirements.txt"
echo "epoch:          $SOURCE_DATE_EPOCH"
echo "md5sum:         $md5"
```
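
The normalization can be exercised without PyPI. A minimal sketch, assuming GNU tar and a hypothetical helper `pack_repro`: it packs two trees that hold the same bytes but different on-disk mtimes, then compares the results. `--no-recursion` is placed ahead of `--files-from` here because GNU tar treats it as position-sensitive.

```bash
# Sketch only: pack_repro and the demo tree are illustrative; assumes GNU tar.
pack_repro() {
    dir=$1      # parent directory that contains wheels/
    epoch=$2    # timestamp stamped on every archive entry
    out=$3      # absolute output path for the .tar.gz
    ( cd "$dir" \
      && find wheels -print0 | LC_ALL=C sort -z \
         | tar --no-recursion --null --files-from=- \
               --mtime="@$epoch" \
               --owner=0 --group=0 --numeric-owner \
               -cf - \
         | gzip -n > "$out" )
}

# Two trees with the same content but different on-disk mtimes.
demo=$(mktemp -d)
mkdir -p "$demo/a/wheels" "$demo/b/wheels"
printf 'x' > "$demo/a/wheels/f.whl"
printf 'x' > "$demo/b/wheels/f.whl"
touch -d '2001-01-01 00:00:00 UTC' "$demo/a/wheels/f.whl"

pack_repro "$demo/a" 1620000000 "$demo/a.tar.gz"
pack_repro "$demo/b" 1620000000 "$demo/b.tar.gz"
cmp "$demo/a.tar.gz" "$demo/b.tar.gz" && echo "identical"
rm -rf "$demo"
```

File modes are not normalized in this sketch, so both trees must be created under the same umask for the archives to match.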

- [ ] **Step 2: Verify a full run emits tarball, lockfile, md5**

```bash
cd /tmp && OUTPUT=/tmp/mkw-test /home/danix/Programming/GIT/mkwheels/mkwheels six 1.16.0 1620000000
ls -l /tmp/mkw-test/six-wheels-1.16.0.tar.gz /tmp/mkw-test/requirements.txt
```
Expected: both files exist; final output prints `md5sum: <hex>` and the
resolved epoch `1620000000`.

- [ ] **Step 3: Verify the lockfile is pinned + hashed**

```bash
cat /tmp/mkw-test/requirements.txt
```
Expected: a line like `six==1.16.0 --hash=sha256:<64 hex>`.

- [ ] **Step 4: Commit**

```bash
git add mkwheels
git commit -S -m "mkwheels: pack reproducible tarball and print md5"
```

---

### Task 6: selftest — byte-identical reproducibility check

**Files:**
- Create: `selftest`

- [ ] **Step 1: Write the selftest**

Create `selftest`, `chmod +x` after:

```bash
#!/bin/bash
# selftest — build six twice and assert the wheels tarballs are byte-identical.
# The smallest check that fails if the reproducible-tar normalization breaks.
set -eu

here=$(cd "$(dirname "$0")" && pwd)
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

# Fixed epoch so both runs use the same mtime (we are testing tar determinism,
# not epoch derivation).
OUTPUT="$tmp/a" "$here/mkwheels" six 1.16.0 1620000000 >/dev/null
OUTPUT="$tmp/b" "$here/mkwheels" six 1.16.0 1620000000 >/dev/null

a=$(md5sum "$tmp/a/six-wheels-1.16.0.tar.gz" | cut -d' ' -f1)
b=$(md5sum "$tmp/b/six-wheels-1.16.0.tar.gz" | cut -d' ' -f1)

if [ "$a" = "$b" ]; then
    echo "PASS: reproducible ($a)"
else
    echo "FAIL: tarballs differ ($a != $b)" >&2
    exit 1
fi
```

- [ ] **Step 2: Run the selftest**

```bash
chmod +x selftest
./selftest
```
Expected: `PASS: reproducible (<md5>)`.
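
When the selftest fails, the md5 mismatch alone does not say what differs. One possible follow-up, sketched here under the assumption of GNU tar: compare the two archives byte-wise with `cmp`, then diff their verbose listings to spot a drifting mtime, owner, or mode. `explain_diff` is a hypothetical helper, not part of the plan.

```bash
# Sketch only: explain_diff is illustrative and assumes GNU tar.
# Returns 0 when the archives are byte-identical, 1 otherwise.
explain_diff() {
    if cmp -s "$1" "$2"; then
        echo "identical"
        return 0
    fi
    la=$(mktemp)
    lb=$(mktemp)
    # --full-time keeps seconds in the listing so mtime drift is visible.
    tar --full-time -tvzf "$1" > "$la"
    tar --full-time -tvzf "$2" > "$lb"
    diff "$la" "$lb" || true
    rm -f "$la" "$lb"
    return 1
}

# Usage: explain_diff a/six-wheels-1.16.0.tar.gz b/six-wheels-1.16.0.tar.gz
```

If the listings match but the bytes do not, the difference lies outside the tar metadata shown in the listing, for example in the gzip header.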

- [ ] **Step 3: Commit**

```bash
git add selftest
git commit -S -m "mkwheels: add selftest asserting byte-identical tarballs"
```

---

### Task 7: README selftest note + final review

**Files:**
- Modify: `README.md` (add a Test section)

- [ ] **Step 1: Add a Test section to the README**

Append to `README.md`:

````markdown
## Test

`./selftest` builds `six` twice with a fixed epoch and asserts the two wheels
tarballs are byte-identical. Run it after changing the tar/packing logic.
````

- [ ] **Step 2: Commit**

```bash
git add README.md
git commit -S -m "mkwheels: document selftest in README"
```

---

## Self-Review notes

- **Spec coverage:** interface (`<pkg> <ver> [epoch]`, OUTPUT) → Task 2;
  epoch auto-derive from PyPI upload_time + warning → Task 3; venv + pip
  download full tree → Task 4; hashed requirements.txt → Task 4; reproducible
  tar (sorted, mtime/owner, gzip -n) → Task 5; error handling (set -eu, trap,
  clear failures) → Tasks 2/3/4; jq used for JSON → Task 3; selftest on six →
  Task 6; GPLv2 LICENSE + header + README License section → Tasks 1/2.
- **Out-of-scope items** (arbitrary requirements.txt input, uploading,
  private indexes, caching) are intentionally not in any task.
- **Name consistency:** `$work`, `$wheels`, `$req`, `$tarball`, `$epoch`,
  `SOURCE_DATE_EPOCH`, `OUTPUT` used consistently across Tasks 2–6.

## Caveat to verify during execution

The `requirements.txt` name/version parse in Task 4 splits the filename on
`-`. Wheel filenames escape hyphens in the project name as underscores, so the
split is safe for wheels, but the parsed `name` may be spelled differently
from the canonical PyPI name (e.g. `typing_extensions` vs
`typing-extensions`); PEP 503 normalization treats the two spellings as the
same project. Sdists are less regular: a hyphenated sdist name mis-splits, and
only the `.tar.gz` suffix is stripped. For the netexec tree most filenames are
well-behaved. This only affects the cosmetic lockfile, not the tarball
contents or the build (the SlackBuild installs from the wheel files directly,
not by re-resolving the lockfile). If it matters, switch the parse to read
`Name` and `Version` from each wheel's `METADATA` instead. Noted, not blocking.
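
To see where the split holds and where it breaks, the Task 4 parameter expansions can be run on a few filenames in isolation. A minimal sketch; `parse_dist` is a hypothetical wrapper around those expansions and the filenames are samples chosen for illustration:

```bash
# Sketch only: parse_dist mirrors the Task 4 parameter expansions;
# the sample filenames are illustrative.
parse_dist() {
    base=$1
    name=${base%%-*}             # text before the first '-'
    rest=${base#*-}              # text after the first '-'
    version=${rest%%-*}          # up to the next '-', if any
    version=${version%.tar.gz}   # sdists carry the suffix on the version
    printf '%s %s\n' "$name" "$version"
}

parse_dist six-1.16.0-py2.py3-none-any.whl             # six 1.16.0
parse_dist typing_extensions-4.12.2-py3-none-any.whl   # typing_extensions 4.12.2
parse_dist foo-bar-1.0.tar.gz                          # foo bar  (mis-split)
```

The last case is the one to watch for: a hyphen inside an sdist's project name shifts the version field.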