# mkwheels Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** A standalone bash CLI that produces a reproducible, pinned Python wheels tarball (`<pkg>-wheels-<version>.tar.gz`) plus a hashed `requirements.txt` for vendoring into SlackBuilds.

**Architecture:** Single bash script drives a throwaway `python3 -m venv` + `pip download` to resolve a package's full dependency tree into wheels, emits a hashed lockfile, and packs the wheels into a byte-reproducible tarball (normalized tar metadata + `gzip -n`, mtime pinned to a `SOURCE_DATE_EPOCH` that defaults to the PyPI release upload time). A selftest builds `six` twice and asserts the tarballs are byte-identical.

**Tech Stack:** bash, python3 + pip, jq, curl, tar/gzip/md5sum. GPLv2 (v2-only).

---

## File Structure

```
~/Programming/GIT/mkwheels/
├── mkwheels      # the script (single-file bash, executable)
├── selftest      # reproducibility check (bash)
├── LICENSE       # GPLv2 full text
├── README.md     # usage, reproducibility rationale, SBo integration
└── .gitignore    # ignore scratch output (*.tar.gz, requirements.txt at root)
```

Responsibilities:

- `mkwheels`: the whole CLI (arg parse, epoch resolution, venv + download, lockfile, reproducible tar).
- `selftest`: runs `mkwheels six <version>` twice and asserts that the md5sums of the two tarballs match.
- `LICENSE` / `README.md`: licensing + docs per global preference.

The script is small enough to stay one file; the selftest is separated so the tool itself carries no test scaffolding.

---

### Task 1: Repo scaffolding (LICENSE, .gitignore, README skeleton)

**Files:**
- Create: `LICENSE`
- Create: `.gitignore`
- Create: `README.md`

- [ ] **Step 1: Add the GPLv2 LICENSE**

Copy the official GPLv2 text into `LICENSE`.
Fetch it (already cached at `/tmp/.../scratchpad/gpl-2.0.txt` during planning; re-fetch if absent):

```bash
curl -fsSL https://www.gnu.org/licenses/old-licenses/gpl-2.0.txt -o LICENSE
head -2 LICENSE   # "GNU GENERAL PUBLIC LICENSE / Version 2, June 1991"
```

- [ ] **Step 2: Add .gitignore**

```
# scratch output from running mkwheels in the repo root
/*.tar.gz
/requirements.txt
```

- [ ] **Step 3: Add README skeleton**

`README.md`:

````markdown
# mkwheels

Build a reproducible, pinned Python wheels tarball for vendoring into a
SlackBuild (or any offline `pip install`). Generic over package + version.

## Usage

```
mkwheels <pkg> <version> [epoch]
```

- `<pkg> <version>`: the PyPI package and exact version to vendor.
- `[epoch]`: optional `SOURCE_DATE_EPOCH`. Omitted → auto-derived from the
  PyPI release upload time (a warning is printed). Pass it to override.
- `OUTPUT` env var overrides the output directory (default: current dir).

Outputs `<pkg>-wheels-<version>.tar.gz` and `requirements.txt` (pinned +
hashed). Prints the md5sum and the resolved epoch.

## Requirements

`bash`, `python3` + `pip`, `jq`, `curl`, `tar`, `gzip`, `md5sum`.

## Reproducibility

Files uploaded to PyPI are immutable, so a given wheel never changes. The
tarball normalizes tar metadata (sorted entries, fixed mtime/owner, `gzip -n`)
so it is byte-identical for the same set of wheels + epoch.

The set of wheels itself is frozen at download time. Dependencies that are
not pinned to an exact version, and git-sourced dependencies (packages whose
upstream pins a git URL), resolve to whatever is current when `pip download`
runs; the emitted `requirements.txt` records the exact resolved versions.
Once built, the tarball is the source of truth.

## SBo integration

Run `mkwheels <pkg> <version>`, upload the tarball to your package host, and
set `DOWNLOAD_x86_64` / `MD5SUM_x86_64` in the SlackBuild `.info` to point at
it. The SlackBuild then runs `pip install --no-index --find-links=<wheels-dir>`
into a venv.

## License

GPLv2 (v2-only). See `LICENSE`. Copyright (C) 2026 Danilo M. .
````

- [ ] **Step 4: Commit**

```bash
git add LICENSE .gitignore README.md
git commit -S -m "mkwheels: add LICENSE, gitignore, README skeleton"
```

---

### Task 2: Script skeleton (header, usage, arg parse, tool checks)

**Files:**
- Create: `mkwheels`

- [ ] **Step 1: Write the script skeleton**

Create `mkwheels`, `chmod +x` it after. Content:

```bash
#!/bin/bash
# mkwheels: build a reproducible, pinned Python wheels tarball for a package.
#
# Copyright (C) 2026 Danilo M.
#
# This program is free software; you can redistribute it and/or modify it
# under the terms of the GNU General Public License version 2 as published by
# the Free Software Foundation.
#
# This program is distributed in the hope that it will be useful, but WITHOUT
# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
# FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License along with
# this program; if not, see .

set -eu

usage() {
  cat <<EOF
Usage: mkwheels <pkg> <version> [epoch]

Build a reproducible pinned Python wheels tarball <pkg>-wheels-<version>.tar.gz
plus a hashed requirements.txt, for vendoring into a SlackBuild.

  <pkg> <version>  PyPI package name and exact version to vendor.
  [epoch]          SOURCE_DATE_EPOCH for the tarball mtime. Omitted -> auto-derived
                   from the PyPI release upload time (a warning is printed).

OUTPUT env var: output directory (default: current dir).
Requires: python3+pip, jq, curl, tar, gzip, md5sum.
EOF
}

case "${1:-}" in
  -h|--help) usage; exit 0 ;;
esac

[ $# -ge 2 ] && [ $# -le 3 ] || { usage >&2; exit 2; }

pkg=$1
ver=$2
epoch=${3:-}
OUTPUT=${OUTPUT:-$PWD}

# Check required tools up front.
for tool in python3 jq curl tar gzip md5sum; do
  command -v "$tool" >/dev/null 2>&1 || {
    echo "error: required tool not found: $tool" >&2
    exit 1
  }
done
python3 -m pip --version >/dev/null 2>&1 || {
  echo "error: python3 pip module not available" >&2
  exit 1
}

echo "mkwheels: $pkg $ver -> $OUTPUT/$pkg-wheels-$ver.tar.gz"
```

- [ ] **Step 2: Verify usage and arg validation**

```bash
chmod +x mkwheels
./mkwheels -h                 # prints usage, exit 0
./mkwheels; echo $?           # usage to stderr, exit 2
./mkwheels onlyone; echo $?   # exit 2
```

Expected: `-h` prints usage; no-arg and one-arg print usage and exit 2.

- [ ] **Step 3: Commit**

```bash
git add mkwheels
git commit -S -m "mkwheels: add script skeleton with arg parse and tool checks"
```

---

### Task 3: Epoch resolution from PyPI

**Files:**
- Modify: `mkwheels` (append epoch-resolution block after the tool checks)

- [ ] **Step 1: Add the epoch resolution block**

Insert after the `echo "mkwheels: ..."` line:

```bash
# Resolve SOURCE_DATE_EPOCH. Explicit arg wins; otherwise derive it from the
# earliest file upload time of this version on PyPI (a real, reproducible,
# release-tied timestamp).
if [ -z "$epoch" ]; then
  meta=$(curl -fsSL "https://pypi.org/pypi/$pkg/$ver/json") || {
    echo "error: cannot fetch PyPI metadata for $pkg $ver" >&2
    exit 1
  }
  iso=$(printf '%s' "$meta" \
    | jq -r '[.urls[].upload_time_iso_8601] | sort | .[0] // empty')
  [ -n "$iso" ] || {
    echo "error: no upload time found for $pkg $ver on PyPI" >&2
    exit 1
  }
  epoch=$(date -u -d "$iso" +%s)
  echo "warning: epoch not given; using PyPI upload time $iso (epoch $epoch)" >&2
fi
export SOURCE_DATE_EPOCH="$epoch"
```

- [ ] **Step 2: Verify epoch derivation against a known release**

```bash
# six 1.16.0 was uploaded 2021-05-05; check we get a stable epoch and warning.
./mkwheels six 1.16.0 2>&1 | grep -i "using PyPI upload time"
```

Expected: a warning line naming a 2021 ISO timestamp and an epoch integer.
(The run will continue past this point only once Task 4 is implemented; for now it is fine if it stops or errors after printing the warning.)

- [ ] **Step 3: Verify explicit epoch suppresses the warning**

```bash
./mkwheels six 1.16.0 1620000000 2>&1 | grep -i "using PyPI" && echo UNEXPECTED || echo "ok: no auto-derive"
```

Expected: `ok: no auto-derive`.

- [ ] **Step 4: Commit**

```bash
git add mkwheels
git commit -S -m "mkwheels: resolve SOURCE_DATE_EPOCH from PyPI upload time"
```

---

### Task 4: Download wheels + emit hashed requirements.txt

**Files:**
- Modify: `mkwheels` (append download + lockfile block)

- [ ] **Step 1: Add the temp workdir, venv, download, and lockfile block**

Insert after the epoch block:

```bash
# Throwaway workdir, cleaned on exit.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
wheels="$work/wheels"
mkdir -p "$wheels"

# Isolated build env so host-installed pkgs don't leak in. User/global pip
# config and PIP_* env vars still apply unless pip is run with --isolated.
python3 -m venv "$work/venv"
"$work/venv/bin/pip" install --quiet --upgrade pip wheel >/dev/null

# Resolve the full tree into $wheels. pip download does not build anything:
# a package with no compatible wheel is kept as its sdist.
"$work/venv/bin/pip" download "$pkg==$ver" --dest "$wheels"

# Emit a pinned, hashed requirements.txt from the downloaded files. Each
# distribution is pinned to its version with a sha256 hash per file.
req="$work/requirements.txt"
: > "$req"
for f in "$wheels"/*; do
  base=$(basename "$f")
  # name-version from the wheel/sdist filename: split on first two '-' fields
  # wheels: name-version-...; sdists: name-version.tar.gz
  name=${base%%-*}
  rest=${base#*-}
  version=${rest%%-*}
  version=${version%.tar.gz}
  hash=$(python3 -c "import hashlib,sys;print(hashlib.sha256(open(sys.argv[1],'rb').read()).hexdigest())" "$f")
  printf '%s==%s --hash=sha256:%s\n' "$name" "$version" "$hash" >> "$req"
done
# C locale so the line order does not depend on the host's collation rules.
LC_ALL=C sort -o "$req" "$req"
```

- [ ] **Step 2: Verify a small download produces wheels + a hashed lockfile**

```bash
./mkwheels six 1.16.0 1620000000 2>/dev/null
# (no tarball yet; that step is Task 5. This just checks it runs without error)
echo $?
```

Expected: exit 0 (download + lockfile build succeed; six is pure-python with no deps, so exactly one entry would be produced internally).

- [ ] **Step 3: Commit**

```bash
git add mkwheels
git commit -S -m "mkwheels: download wheels and emit hashed requirements.txt"
```

---

### Task 5: Reproducible tar + final output

**Files:**
- Modify: `mkwheels` (append tar/output block)

- [ ] **Step 1: Add the reproducible tar and output block**

Insert after the lockfile block:

```bash
mkdir -p "$OUTPUT"
# Make OUTPUT absolute: the tar step below runs from inside $work, so a
# relative path would no longer point at the intended directory.
OUTPUT=$(cd "$OUTPUT" && pwd)
tarball="$OUTPUT/$pkg-wheels-$ver.tar.gz"

# Reproducible archive: sorted entries, normalized ownership/mtime, gzip -n.
# Run from $work so the archive holds a top-level 'wheels/' dir.
# --no-recursion comes before --files-from because recent GNU tar treats it as
# position-sensitive; placed after, the 'wheels' dir would be recursed into
# and its files archived a second time in unsorted order.
( cd "$work" \
  && find wheels -print0 | LC_ALL=C sort -z \
  | tar --no-recursion --null --files-from=- \
        --mtime="@$SOURCE_DATE_EPOCH" \
        --owner=0 --group=0 --numeric-owner \
        -cf - \
  | gzip -n > "$tarball" )
cp "$work/requirements.txt" "$OUTPUT/requirements.txt"

md5=$(md5sum "$tarball" | cut -d' ' -f1)
echo "wheels tarball: $tarball"
echo "requirements:   $OUTPUT/requirements.txt"
echo "epoch:          $SOURCE_DATE_EPOCH"
echo "md5sum:         $md5"
```

- [ ] **Step 2: Verify a full run emits tarball, lockfile, md5**

```bash
cd /tmp && OUTPUT=/tmp/mkw-test /home/danix/Programming/GIT/mkwheels/mkwheels six 1.16.0 1620000000
ls -l /tmp/mkw-test/six-wheels-1.16.0.tar.gz /tmp/mkw-test/requirements.txt
```

Expected: both files exist; final output prints `md5sum: <hash>` and the resolved epoch `1620000000`.

- [ ] **Step 3: Verify the lockfile is pinned + hashed**

```bash
cat /tmp/mkw-test/requirements.txt
```

Expected: a line like `six==1.16.0 --hash=sha256:<64 hex>`.

- [ ] **Step 4: Commit**

```bash
git add mkwheels
git commit -S -m "mkwheels: pack reproducible tarball and print md5"
```

---

### Task 6: selftest (byte-identical reproducibility check)

**Files:**
- Create: `selftest`

- [ ] **Step 1: Write the selftest**

Create `selftest`, `chmod +x` after:

```bash
#!/bin/bash
# selftest: build six twice and assert the wheels tarballs are byte-identical.
# The smallest check that fails if the reproducible-tar normalization breaks.
set -eu
here=$(cd "$(dirname "$0")" && pwd)
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

# Fixed epoch so both runs use the same mtime (we are testing tar determinism,
# not epoch derivation).
OUTPUT="$tmp/a" "$here/mkwheels" six 1.16.0 1620000000 >/dev/null
OUTPUT="$tmp/b" "$here/mkwheels" six 1.16.0 1620000000 >/dev/null

a=$(md5sum "$tmp/a/six-wheels-1.16.0.tar.gz" | cut -d' ' -f1)
b=$(md5sum "$tmp/b/six-wheels-1.16.0.tar.gz" | cut -d' ' -f1)

if [ "$a" = "$b" ]; then
  echo "PASS: reproducible ($a)"
else
  echo "FAIL: tarballs differ ($a != $b)" >&2
  exit 1
fi
```

- [ ] **Step 2: Run the selftest**

```bash
chmod +x selftest
./selftest
```

Expected: `PASS: reproducible (<md5>)`.

- [ ] **Step 3: Commit**

```bash
git add selftest
git commit -S -m "mkwheels: add selftest asserting byte-identical tarballs"
```

---

### Task 7: README selftest note + final review

**Files:**
- Modify: `README.md` (add a Test section)

- [ ] **Step 1: Add a Test section to the README**

Append to `README.md`:

````markdown
## Test

`./selftest` builds `six` twice with a fixed epoch and asserts the two wheels
tarballs are byte-identical. Run it after changing the tar/packing logic.
````

- [ ] **Step 2: Commit**

```bash
git add README.md
git commit -S -m "mkwheels: document selftest in README"
```

---

## Self-Review notes

- **Spec coverage:** interface (`<pkg> <version> [epoch]`, `OUTPUT`) → Task 2; epoch auto-derive from PyPI upload_time + warning → Task 3; venv + pip download full tree → Task 4; hashed requirements.txt → Task 4; reproducible tar (sorted, mtime/owner, gzip -n) → Task 5; error handling (set -eu, trap, clear failures) → Tasks 2/3/4; jq used for JSON → Task 3; selftest on six → Task 6; GPLv2 LICENSE + header + README License section → Tasks 1/2.
- **Out-of-scope items** (arbitrary requirements.txt input, uploading, private indexes, caching) are intentionally not in any task.
- **Name consistency:** `$work`, `$wheels`, `$req`, `$tarball`, `$epoch`, `SOURCE_DATE_EPOCH`, `OUTPUT` used consistently across Tasks 2–6.

## Caveat to verify during execution

The `requirements.txt` name/version parse in Task 4 splits on `-` from the filename.
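PEP 503 treats project names as equivalent after lowercasing and collapsing runs of `-`, `_`, and `.` into a single `-`, while wheel filenames escape `-` in the project name to `_`. For a wheel, the parsed `name` can therefore be spelled differently from the canonical PyPI name (for example `typing_extensions` instead of `typing-extensions`), although pip treats both as the same project. An sdist whose filename keeps a hyphenated name is the worse case: splitting on the first `-` truncates the name. For the netexec tree most filenames are well-behaved. This only affects the cosmetic lockfile, not the tarball contents or the build (the SlackBuild installs from the wheel files directly, not by re-resolving the lockfile). If it matters, switch the parse to read the `Name`/`Version` fields from each wheel's `METADATA` instead. Noted, not blocking.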
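
As a rough illustration of the caveat above, the filename parse could be hardened without opening each wheel. This is a sketch only, not part of the plan's script and not the `METADATA`-based approach: the helper names `normalize_name` and `parse_dist` are hypothetical, and the sdist branch assumes the version string itself contains no `-`.

```bash
#!/bin/bash
# Hypothetical helpers (assumptions, not part of mkwheels): a filename parse
# that tolerates hyphenated sdist names, plus PEP 503 name normalization.

# normalize_name NAME
# Lowercase NAME and collapse runs of '-', '_' and '.' into a single '-'.
normalize_name() {
  printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | sed -E 's/[-_.]+/-/g'
}

# parse_dist FILENAME
# Print "name version" for a wheel or sdist filename; return 1 otherwise.
parse_dist() {
  local base=$1 stem name version rest
  case "$base" in
    *.whl)
      # Wheel filenames escape '-' in the project name, so the first two
      # '-'-separated fields are the name and the version.
      name=${base%%-*}
      rest=${base#*-}
      version=${rest%%-*}
      ;;
    *.tar.gz|*.zip)
      # sdist: drop the extension, then split on the LAST '-' so that a
      # hyphenated project name stays intact.
      stem=${base%.tar.gz}
      stem=${stem%.zip}
      name=${stem%-*}
      version=${stem##*-}
      ;;
    *)
      echo "parse_dist: unrecognized file: $base" >&2
      return 1
      ;;
  esac
  printf '%s %s\n' "$(normalize_name "$name")" "$version"
}
```

Inside the Task 4 loop, something like `read -r name version < <(parse_dist "$base")` could replace the four parameter-expansion lines. Reading `METADATA` remains the more robust option, since it does not rely on filename conventions at all.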