LLM Inference in C++: Building High-Throughput Engines with PagedAttention and CUDA Kernels (High-Performance C++ Engineering)

Independently published

Pages: 287, Hardcover, Independently published

Compare prices (1 shop)

Shop	Price
	25,52 GBP

Similar products

LLM Inference in C++: Building High-Throughput Engines with PagedAttention and CUDA Kernels (High-Performance C++ Engineering)
From 22,61 EUR
CUDA C++ Debugging: Safer GPU Kernel Programming (Generative AI LLM Programming)
From 6,95 EUR
CUDA C++ Optimization: Coding Faster GPU Kernels (Generative AI LLM Programming)
From 5,95 EUR
Rust Programming for AI and CUDA: Master High-Performance Machine Learning with Safe GPU Kernels, Inference, and Scalable Training
From 20,39 EUR
Qwen 3.5 AI Agents on GPU and CUDA: The Engineer's Guide to Mastering Hardware Sizing, Local LLM Inference, Optimize VRAM, Building and Scaling Native Multimodal AI in Production
From 19,27 EUR
AI Systems Performance Engineering: Optimizing Model Training and Inference Workloads with GPUs, CUDA, and PyTorch
From 60,85 EUR