<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Model Serving on Learn by Tanhdev</title><link>https://learn.tanhdev.com/tags/model-serving/</link><description>Recent content in Model Serving on Learn by Tanhdev</description><generator>Hugo</generator><language>en-us</language><lastBuildDate>Tue, 26 May 2026 08:00:00 +0700</lastBuildDate><atom:link href="https://learn.tanhdev.com/tags/model-serving/index.xml" rel="self" type="application/rss+xml"/><item><title>Optimizing vLLM Serving: Comparing AWQ, GPTQ, and GGUF</title><link>https://learn.tanhdev.com/series/slm-playbook/part-6-vllm-deployment-evals/</link><pubDate>Tue, 26 May 2026 08:00:00 +0700</pubDate><guid>https://learn.tanhdev.com/series/slm-playbook/part-6-vllm-deployment-evals/</guid><description>A handbook for operating SLMs on vLLM. Compares the AWQ, GPTQ, and GGUF quantization formats and walks through a Dynamic LoRA configuration that saves GPU memory.</description></item></channel></rss>