LLM Inference in C++: Building High-Throughput Engines with PagedAttention and CUDA Kernels (High-Performance C++ Engineering)
Independently published
Pages: 287, Hardcover, Independently published
Compare prices (1 shop)
| Shop | Price | Action |
|---|---|---|
| | 25,52 GBP | Go to shop |
