CVE-2025-49847

EUVD-2025-18632

17.06.2025, 20:15

llama.cpp is a C/C++ inference engine for several LLM models. Prior to version b5662, an attacker-supplied GGUF model vocabulary can trigger a buffer overflow in llama.cpp's vocabulary-loading code. Specifically, the helper _try_copy in llama.cpp/src/vocab.cpp: llama_vocab::impl::token_to_piece() casts a very large size_t token length to int32_t. The narrowed value can wrap to a negative or otherwise small number, so the length check (if (length < (int32_t)size)) does not fire. memcpy is then still called with the original oversized size, letting a malicious model write past the end of the intended buffer. This can lead to arbitrary memory corruption and potentially to code execution. The issue is patched in version b5662.
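The bypass described above can be illustrated with a minimal C++ sketch. It is not the llama.cpp source or the actual patch: the function names flawed_guard_allows_copy and checked_copy are hypothetical, and the sketch assumes a platform where converting an out-of-range size_t to int32_t wraps modulo 2^32 (guaranteed since C++20, implementation-defined but typical before that). The first function only models the comparison and never performs the oversized copy; the second shows one possible way to compare in the unsigned domain instead.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical model of the flawed guard: size is narrowed to int32_t before
// the comparison. Returns true when the guard would let memcpy run.
bool flawed_guard_allows_copy(int32_t length, std::size_t size) {
    if (length < (int32_t) size) {
        return false;  // guard fires: the copy is refused
    }
    return true;       // guard does not fire: the copy would proceed
}

// One possible safer shape (an assumption, not the upstream fix): reject
// negative buffer lengths and compare as size_t, so nothing is truncated.
// Returns the number of bytes copied, or -1 if the token does not fit.
int32_t checked_copy(char * buf, int32_t length, const char * token, std::size_t size) {
    if (length < 0 || size > static_cast<std::size_t>(length)) {
        return -1;
    }
    std::memcpy(buf, token, size);
    // size <= length <= INT32_MAX here, so this conversion cannot truncate.
    return static_cast<int32_t>(size);
}
```

With an 8-byte buffer and a claimed token size of 0x80000000, the narrowed size becomes negative, so flawed_guard_allows_copy(8, 0x80000000u) reports that the copy would go ahead, while checked_copy refuses the same request without touching memory.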

Provider	Type	Base Score	Atk. Vector	Atk. Complexity	Priv. Required	Vector
NIST	Primary	8.8 HIGH	NETWORK	LOW	NONE	CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H

Awaiting analysis

This vulnerability is currently awaiting analysis.

EPSS Score

Percentile: 35%

Ubuntu Releases

Codename	llama.cpp (Ubuntu product)
jammy	dne
noble	dne
oracular	dne
plucky	dne
questing	needs-triage
resolute	needs-triage

Common Weakness Enumeration

CWE-119 - Improper Restriction of Operations within the Bounds of a Memory Buffer
The software performs operations on a memory buffer, but it can read from or write to a memory location that is outside of the intended boundary of the buffer.

References

https://github.com/ggml-org/llama.cpp/commit/3cfbbdb44e08fd19429fed6cc85b982a91f0efd5

https://github.com/ggml-org/llama.cpp/security/advisories/GHSA-8wwf-w4qm-gpqr