Tags - reference - Jifeng Wu's Personal Website

06-07

vLLM Platform System

05-31

How the KV Cache Works in HuggingFace Transformers

05-23

Rust Crates and Python Packages

05-19

Local CUDA vLLM Setup for Python-Only Development Using a Precompiled Wheel

05-17

Recording Audio on Linux with PulseAudio

05-14

Compile NEFF Executables from NKI Kernels

05-13

Type Theory Concepts: A to Z

05-13

What is S3?

05-12

vLLM Internals — PagedAttention and Custom Accelerator Compilation

05-04

Hugging Face Model Repositories: Organization, Semantics, and Portability