LLM Inference in C++: Building High-Throughput Engines with PagedAttention and CUDA Kernels (High-Performance C++ Engineering)

Independently published

Pages: 287, Hardcover, Independently published

Compare prices (1 shop)

Price: 25.52 GBP
