LLM Inference in C++: Building High-Throughput Engines with PagedAttention and CUDA Kernels (High-Performance C++ Engineering)

Independently published

Pages: 282, Paperback, Independently published

Price: 22.61 GBP (1 shop)
