vLLM and High-Performance Inference: Memory Optimization, Parallel Execution, Token Streaming, and Scalable Model Serving: 2 (Large Language Model Refinement and Inference Series)
Independently published
Pages: 183, Paperback, Independently published
Compare prices (1 shop)
| Shop | Price | Action |
|---|---|---|
|  | 13,99 GBP | Go to shop |
