CVE-2025-62164

EUVD-2025-198314

21.11.2025, 02:15

vLLM is an inference and serving engine for large language models (LLMs). From version 0.10.2 to before 0.11.1, a memory corruption vulnerability exists in the Completions API endpoint that can lead to a crash (denial of service) and potentially to remote code execution (RCE). When processing user-supplied prompt embeddings, the endpoint loads serialized tensors using torch.load() without sufficient validation. Due to a change introduced in PyTorch 2.8.0, sparse tensor integrity checks are disabled by default. As a result, maliciously crafted tensors can bypass internal bounds checks and trigger an out-of-bounds memory write during the call to to_dense(). This memory corruption can crash vLLM and potentially lead to code execution on the server hosting vLLM. This issue has been patched in version 0.11.1.
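The validation gap described above can be illustrated with a minimal sketch. It assumes the prompt embeddings arrive as raw serialized bytes, and the function name load_prompt_embeds is hypothetical. It is not presented as the actual vLLM patch, only as one way to re-enable PyTorch's sparse tensor invariant checks around torch.load() and to_dense():

```python
import io

import torch


def load_prompt_embeds(raw: bytes) -> torch.Tensor:
    """Deserialize an untrusted tensor with sparse invariant checks enabled.

    Hypothetical helper for illustration; the name and signature are assumptions.
    """
    # Re-enable the sparse tensor integrity checks for everything in this block,
    # so a sparse tensor with out-of-range indices is rejected before densifying.
    with torch.sparse.check_sparse_tensor_invariants():
        obj = torch.load(
            io.BytesIO(raw),
            weights_only=True,  # restrict unpickling to tensor-related types
            map_location=torch.device("cpu"),
        )
        if not isinstance(obj, torch.Tensor):
            raise ValueError("expected a serialized tensor")
        # Only non-strided (sparse) layouts need conversion to a dense tensor.
        if obj.layout != torch.strided:
            obj = obj.to_dense()
    return obj
```

A caller would typically wrap this in error handling and treat any exception as a rejected request rather than letting it propagate to the serving loop.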

Provider	Type	Base Score	Atk. Vector	Atk. Complexity	Priv. Required	Vector
NIST	Primary	8.8 HIGH	NETWORK	LOW	LOW	CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H

EPSS Score

Percentile: 53%

Affected Products (NVD)

Vendor	Product	Version
vllm	vllm	0.10.2 ≤ 𝑥 < 0.11.1
vllm	vllm	0.11.1:rc0
vllm	vllm	0.11.1:rc1

𝑥 = Vulnerable software versions
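The affected range in the table above can be checked programmatically. The following is a rough sketch using only the standard library. The helper names are hypothetical, and it assumes version strings of the form X.Y.Z with an optional rcN suffix (written either as 0.11.1rc0 or 0.11.1:rc0); a dedicated version-parsing library would be more robust for other forms:

```python
import re

# Matches X.Y.Z with an optional release-candidate suffix such as "rc0" or ":rc1".
_VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:[:.-]?rc(\d+))?")


def parse_version(version: str) -> tuple:
    """Turn a version string into a tuple that sorts in release order."""
    match = _VERSION_RE.match(version.strip())
    if match is None:
        raise ValueError(f"unrecognized version: {version!r}")
    major, minor, patch, rc = match.groups()
    # Release candidates sort before the final release of the same X.Y.Z.
    stage = (0, int(rc)) if rc is not None else (1, 0)
    return (int(major), int(minor), int(patch), stage)


def is_affected(version: str) -> bool:
    """Return True if the version falls in 0.10.2 <= x < 0.11.1."""
    return parse_version("0.10.2") <= parse_version(version) < parse_version("0.11.1")


if __name__ == "__main__":
    for candidate in ("0.10.1", "0.10.2", "0.11.0", "0.11.1rc1", "0.11.1"):
        print(candidate, is_affected(candidate))
```

Because release candidates sort below the final release, 0.11.1rc0 and 0.11.1rc1 are reported as affected, in line with the separate rows listed for them.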

Common Weakness Enumeration

CWE-20 - Improper Input Validation
The product receives input or data, but it does not validate or incorrectly validates that the input has the properties that are required to process the data safely and correctly.

References

https://github.com/vllm-project/vllm/commit/58fab50d82838d5014f4a14d991fdb9352c9c84b

https://github.com/vllm-project/vllm/pull/27204

https://github.com/vllm-project/vllm/security/advisories/GHSA-mrw7-hf4f-83pf