Red Hat Performance and Scale Engineering
Red Hat’s most recent posts about Performance, Scale, Chaos and more.

LATEST BLOGS

Autoscaling vLLM with OpenShift AI model serving: Performance validation
November 26, 2025 Alberto Perdomo

In my previous blog, How to set up KServe autoscaling for vLLM with KEDA, we explored the foundational setup of vLLM autoscaling in Open Data Hub (ODH) using KEDA and the custom metrics autoscaler operator. We established the architecture for a scaling strategy that goes beyond traditional CPU and memory metrics, using AI inference-specific service-level indicators (SLI). Now, it’s time to put this system to t
