9 posts in total
2026
vLLM Internals — PagedAttention and Custom Accelerator Compilation
Hugging Face Model Repositories: Organization, Semantics, and Portability
Schedules in Machine Learning Computation: What They Are and Who Needs to Know About Them
Exporting Compute Graphs, LLM Shape Dynamics, and Serving Runtimes
PyTorch + CUDA vs. XLA + TPU: Two Execution Models for ML Systems
Main Takeaways from a Group Discussion on AI Coding
Qt, OpenCV, PyTorch: The Central Dogma of GUI CV Applications
2025
Running Local LLMs with Ollama
2023
Understanding the Name, Structure, and Loss Function of the Variational Autoencoder