vLLM and High-Performance Inference: Memory Optimization, Parallel Execution, Token Streaming, and Scalable Model Serving: 2 (Large Language Model Refinement and Inference Series)

Pages: 183, Paperback, Independently published

Price: 13.99 GBP (1 shop)
