# mkwheels

A standalone bash CLI that builds a **reproducible, pinned Python wheels
tarball** for a given package + version. Primary use: vendoring a Python
project's full dependency tree into a SlackBuild so the build installs into a
venv from local wheels with no network (the feroxbuster vendored-crates
pattern, applied to Python).

## Layout

```
mkwheels          # the whole CLI (single-file bash)
selftest          # reproducibility check (both modes, asserts md5 match)
LICENSE           # GPLv2 full text
README.md         # user-facing usage + rationale
docs/superpowers/ # design spec + implementation plan
```

Single-file script by design. Keep it that way unless the tool genuinely
outgrows one file.

## Invocation

Two subcommands; all options are explicit flags, no positionals.

```
mkwheels pypi --name PKG --ver VER [--epoch N]
mkwheels gh   --repo OWNER/REPO --ver VER [--name PKG] [--tag TAG] [--epoch N]
```

- A single leading `v` is stripped from `--ver` / `--tag`; the output version
  never carries the `v`. Output: `<name>-wheels-<ver>.tar.gz` + `requirements.txt`.
- `--epoch` optional in both modes; omitted → auto-derived (with a warning):
  - `pypi`: earliest file's `upload_time_iso_8601` from the PyPI JSON.
  - `gh`: the GitHub release `published_at` for the tag.
- `gh` defaults: `--name` = repo basename lowercased; `--tag` = normalized
  `--ver`; the real ref is resolved by trying `<tag>` then `v<tag>`.
- `OUTPUT` env var — output dir (default: `$PWD`).
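
The normalization and `gh` defaults described in this list can be sketched with plain parameter expansion. This is a minimal illustration, not the script's actual code: the function names `normalize_ver`, `default_name`, and `ref_candidates` and the repo `Example/MyTool` are hypothetical.

```shell
# Hypothetical helpers; names are illustrative, not taken from mkwheels.

# Strip exactly one leading "v": v1.2.3 -> 1.2.3, 1.2.3 stays unchanged.
normalize_ver() {
    printf '%s\n' "${1#v}"
}

# Default --name in gh mode: repo basename, lowercased.
default_name() {
    local repo=$1             # OWNER/REPO
    local base=${repo##*/}    # REPO
    printf '%s\n' "$base" | tr '[:upper:]' '[:lower:]'
}

# Refs in the order they would be tried: <tag>, then v<tag>.
ref_candidates() {
    local tag
    tag=$(normalize_ver "$1")
    printf '%s\n' "$tag" "v$tag"
}

normalize_ver v1.2.3           # prints 1.2.3
default_name Example/MyTool    # prints mytool
ref_candidates v3.1            # prints 3.1 then v3.1
```

`${1#v}` removes only one `v`, which matches the "single leading `v`" rule: `vv2` becomes `v2`, not `2`.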

## How it works

1. Arg parse (mode selector + flags) + required-tool check (`python3`+pip,
   `jq`, `curl`, `tar`, `gzip`, `md5sum`).
2. Mode resolution sets name, epoch, and how `wheels/` is populated:
   - `pypi`: epoch from PyPI JSON; `pip download <name>==<ver>` (pre-built
     wheels, deterministic).
   - `gh`: resolve release ref + `published_at`; download+unpack the tagged
     source; `pip wheel <src_dir>` builds the project **and all deps** (PyPI +
     `git+` deps) to wheels. `pip download <dir>` is wrong here — it only
     resolves metadata and leaves the local project unmaterialized.
3. Emit pinned + hashed `requirements.txt` (audit record only, not the install
   input).
4. Pack a byte-reproducible `.tar.gz`: sorted entries, `--mtime=@epoch`,
   `--owner=0 --group=0 --numeric-owner`, `gzip -n`.
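
Step 4 might look roughly like the sketch below. It assumes GNU tar (for `--sort=name` and `--mtime`) and uses a hypothetical function name, `pack_tarball`; the real script may sort entries differently, for example by feeding tar a sorted file list.

```shell
# pack_tarball SRC_DIR EPOCH OUT
# Hypothetical helper: writes a deterministic .tar.gz of SRC_DIR.
# Assumes GNU tar; --sort=name needs tar >= 1.28.
pack_tarball() {
    local src=$1 epoch=$2 out=$3
    (
        # Subshell keeps pipefail local; a tar failure fails the pipeline.
        set -o pipefail
        tar --sort=name \
            --mtime="@${epoch}" \
            --owner=0 --group=0 --numeric-owner \
            -C "$(dirname "$src")" -cf - "$(basename "$src")" \
          | gzip -n > "$out"
    )
}
```

Called as `pack_tarball wheels 1700000000 six-wheels-1.16.0.tar.gz`, two runs over the same `wheels/` content should produce identical bytes even if file mtimes on disk changed in between, because entry order, timestamps, and ownership are all fixed and `gzip -n` omits the name and timestamp from the gzip header.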

## Reproducibility

This is the whole point. The same inputs + epoch MUST yield a byte-identical
tarball. The tar normalization (step 4) plus `set -o pipefail` (so a `tar`
failure can't be masked by `gzip` exiting 0) are what guarantee it. In `gh`
mode the project is built from source, so reproducibility holds per-machine
(build once on the target platform, upload, pin md5); wheels with compiled
extensions may differ across toolchains.
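
The `pipefail` point is easy to check in isolation. In this small demonstration, `false | true` stands in for a failing `tar` piped into a succeeding `gzip`; the variable names are illustrative.

```shell
# Without pipefail the pipeline reports only the last command's status.
status_without=$(bash -c 'false | true; echo $?')

# With pipefail a failure anywhere in the pipeline is reported.
status_with=$(bash -c 'set -o pipefail; false | true; echo $?')

echo "without pipefail: $status_without"   # prints 0
echo "with pipefail:    $status_with"      # prints 1
```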

**Git-sourced deps** (packages whose upstream pins a git URL, e.g. NetExec's
impacket) are frozen at build time: `pip` resolves whatever is
current, and the tarball, once built, is the source of truth. The
`requirements.txt` records the exact resolved versions.
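
One way such a pinned and hashed record could be derived from the wheels themselves is sketched below. This is an assumption about the approach, not the script's confirmed method: `req_line` is a hypothetical helper that reads name and version from the first two `-`-separated fields of the wheel filename and hashes the file with `sha256sum`.

```shell
# req_line WHEEL
# Hypothetical helper: prints "name==version --hash=sha256:<digest>".
# Relies on the wheel filename layout name-version-...-.whl.
req_line() {
    local whl=$1 base name rest ver digest
    base=$(basename "$whl")
    name=${base%%-*}     # first field: distribution name
    rest=${base#*-}
    ver=${rest%%-*}      # second field: version
    digest=$(sha256sum "$whl" | cut -d' ' -f1)
    printf '%s==%s --hash=sha256:%s\n' "$name" "$ver" "$digest"
}
```

A full record would then be something like `for w in wheels/*.whl; do req_line "$w"; done | LC_ALL=C sort > requirements.txt`, with the sort keeping the file stable across runs.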

## Conventions

- `bash`, no third-party Python packages (the tool only drives `pip`).
- `set -eu` + `set -o pipefail`; temp workdir removed via `trap` on EXIT.
- GPLv2 (v2-only): per-file header, `Copyright (C) <year> Danilo M.
  <danix@danix.xyz>`, License section in README.
- Commits GPG-signed. Work directly on master (small project).
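
The strict-mode and cleanup convention can be illustrated as follows. The wrapper `run_in_workdir` is hypothetical and exists only to make the pattern self-contained; in a single-file script the same `set` and `trap` lines would more likely sit at the top level.

```shell
# run_in_workdir CMD [ARGS...]
# Hypothetical wrapper, not part of mkwheels. Runs CMD with a fresh temp
# dir appended as its last argument and removes that dir when the
# subshell exits, whether CMD succeeded or failed.
run_in_workdir() {
    (
        set -eu
        set -o pipefail
        workdir=$(mktemp -d)
        trap 'rm -rf "$workdir"' EXIT
        "$@" "$workdir"
    )
}

# Example: show the temp dir path; the dir is gone once the call returns.
run_in_workdir echo "working in"
```

Using `trap ... EXIT` rather than an explicit `rm` at the end means the workdir is also removed when `set -e` aborts the run partway through.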

## Testing

`./selftest` — builds twice with a fixed epoch in both modes (`pypi` six,
`gh` pyparsing) and asserts each pair of tarballs is byte-identical. Run it
after any change to the tar/packing or mode-resolution logic. Needs network
(pypi.org, github.com). No test framework.
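
The byte-identity assertion at the heart of such a check could be as small as the sketch below. `assert_identical` is a hypothetical name, and the real `selftest` may compare its tarballs differently.

```shell
# assert_identical A B
# Hypothetical helper: succeeds only if both files have the same md5.
assert_identical() {
    local a b
    a=$(md5sum "$1" | cut -d' ' -f1)
    b=$(md5sum "$2" | cut -d' ' -f1)
    if [ "$a" = "$b" ]; then
        echo "OK   $a"
    else
        echo "FAIL $1=$a $2=$b" >&2
        return 1
    fi
}
```

Each mode would build into two separate output directories with the same fixed `--epoch`, then pass the two resulting tarballs to this function.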

## Maintainer

danix — danix@danix.xyz