26 posts in total
2026
vLLM Platform System
Local CUDA vLLM Setup for Python-Only Development Using a Precompiled Wheel
Compile NEFF Executables from NKI Kernels
What is S3?
vLLM Internals — PagedAttention and Custom Accelerator Compilation
Exporting Compute Graphs, LLM Shape Dynamics, and Serving Runtimes
Schedules in Machine Learning Computation: What They Are and Who Needs to Know About Them
Important Locations on Jailbroken iOS
Learning MLIR and HLO by Building a Tiny StableHLO-to-LLVM IR Compiler
Using MLIR as a C++ Library with a Relocatable Install