FP8 on Learn by Tanhdev

Recent content in FP8 on Learn by Tanhdev: https://learn.tanhdev.com/tags/fp8/

Inference Optimization & Deploying vLLM in Production
Sun, 17 May 2026 12:00:00 +0700
https://learn.tanhdev.com/series/ai-data-engineering-pipeline/part-8-inference-optimization-vllm/
Overcome VRAM limits and optimize server costs when deploying a 70B LLM with vLLM, PagedAttention, and FP8/AWQ quantization.