From ebb26eac2948e02def3c7ac1ac23c4ecd345a5a7 Mon Sep 17 00:00:00 2001
From: "Danilo M."
Date: Fri, 3 Apr 2026 18:17:29 +0200
Subject: repo: flatten layout — move packages to root, extras to .extras/
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- Move all packages from SlackBuilds/ to repo root
- Move hooks/, docs/, nvchecker.toml to .extras/
- Update CLAUDE.md and README.md to reflect new structure

Co-Authored-By: Claude Sonnet 4.6
---
 llama.cpp-vulkan/README | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)
 create mode 100644 llama.cpp-vulkan/README

diff --git a/llama.cpp-vulkan/README b/llama.cpp-vulkan/README
new file mode 100644
index 0000000..5509d44
--- /dev/null
+++ b/llama.cpp-vulkan/README
@@ -0,0 +1,22 @@
+llama.cpp
+
+LLM inference in C/C++
+
+The main goal of llama.cpp is to enable LLM inference with minimal
+setup and state-of-the-art performance on a wide range of hardware
+locally and in the cloud.
+
+ - Plain C/C++ implementation without any dependencies
+ - Apple silicon is a first-class citizen - optimized via ARM NEON,
+   Accelerate and Metal frameworks
+ - AVX, AVX2, AVX512 and AMX support for x86 architectures
+ - RVV, ZVFH, ZFH, ZICBOP and ZIHINTPAUSE support for RISC-V
+   architectures
+ - 1.5-bit, 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, and 8-bit integer
+   quantization for faster inference and reduced memory use
+ - Custom CUDA kernels for running LLMs on NVIDIA GPUs (support for
+   AMD GPUs via HIP and Moore Threads GPUs via MUSA)
+ - Vulkan and SYCL backend support
+ - CPU+GPU hybrid inference to partially accelerate models larger than
+   the total VRAM capacity
+