7 posts in total
2026
vLLM Internals — PagedAttention and Custom Accelerator Compilation
Our Cognitive Profile and a Personal Playbook for the Agentic Era
Exporting Compute Graphs, LLM Shape Dynamics, and Serving Runtimes
Jie Liu's B-Exam: Abstractions and Optimizations for Sparse Tensor Computation on Modern Hardware
Critique of "RFSeek and Ye Shall Find: A tool for summary visualization and analysis of RFCs"
Main Takeaways from a Group Discussion on AI Coding
2025
Running Local LLMs with Ollama