In 1997, a flaw was discovered in how Linux and Windows handled IP fragmentation, a Denial-of-Service vulnerability which allowed systems to be crashed remotely.
Background: IP Fragmentation
IP Fragmentation occurs when a packet passes through a link whose MTU is smaller than the packet size; it gets fragmented into multiple smaller packets, and subsequently reassembled by the receiver.
The data in these fragments would typically be non-overlapping, but since partial overlap is permitted by the protocol, IP implementations need to handle that case.
Re-assembling Fragmented Packets
When Linux reassembled fragmented packets, it would loop over each fragment, copying the payload of each fragment into a newly-allocated buffer. In order to handle the case where a fragment overlaps with the previous fragment, if its start offset was less than the end offset of the previous fragment, then its start offset would be moved forward to eliminate the overlap. Here’s a diagram to help visualize this:
0 22 40 v v v |-----Fragment--1-----| |-----Fragment--2-----| ^ \ 19 \ Overlap here becomes... 0 22 40 v v v |-----Fragment--1-----| |---Fragment--2---| \ \ Overlap discarded
And here’s how this was implemented (with only relevant code shown):
/*
* Determine the position of this fragment.
*/
end = offset + ntohs(iph->tot_len) - ihl;
...
/*
* We found where to put this one.
* Check for overlap with preceding fragment, and, if needed,
* align things so that any overlaps are eliminated.
*/
if (prev != NULL && offset < prev->end)
{
i = prev->end - offset;
offset += i; /* ptr into datagram */
ptr += i; /* ptr into fragment data */
}
...
/* Fill in the structure. */
fp->offset = offset;
fp->end = end;
fp->len = end - offset;
offset
is the fragment’s offset in the complete datagram. So if offset
is before the end of the previous fragment, then the middle if
block above sets it to the value of prev->end
, eliminating the overlap. Finally, length of the fragment is updated based on the new offset
and end
of the fragment.
Spotting the Vulnerability
Whenever we’re using arithmetic to compute a size or length, we need to ask ourselves: can this turn out negative?
fp->len = end - offset;
At first glance, it seems like the answer is "no" for this fragment: end
is defined by adding the fragment’s payload length (definitely >= 0
) to the value of offset
, so it would seem that end - offset >= 0
must be true.
However, this analysis ignores what might have happened in between the assignment of end
and the evaluation of end - offset
; in particular, the handling of overlap. offset
effectively gets set to prev->end
(the end of the previous fragment), but still, this fragment should at worst be a duplicate of the previous one, meaning prev->end == end
, and consequently end - offset == 0
. So after a second look, this might still look problem-free.
The dangerous word here is "should". Suppose a pair of fragments comes in that look more like this:
0 22 v v |-----Fragment--1-----| |--Fragment--2--| ^ ^ 4 20
This is obviously wrong and shouldn’t happen; fragments might overlap or duplicate on another, but they should never be a strict subset of another fragment. Maybe so, but can they, anyway? It turns out the answer is yes, they can, for example if they’re deliberately crafted that way by a malicious sender.
If the length of the second fragment is not enough to "cover" the overlap correction, then the updated offset
of the fragment becomes greater than its end
, and consequently the computed fp->len
is negative. This length eventually gets passed to memcpy()
, causing a buffer overflow and an extremely large copy that will lock up or crash the system.
What Went Wrong?
The root cause here was arguably a misplaced trust in untrusted input; it’s a very easy mistake to make as a developer, since we naturally expect that all parts of a system want to work correctly (even the parts that we don’t control). In this case, input is coming from a completely untrusted source, and while there’s a whole class of inputs which a robust sender should never send, there’s nothing which prevents a sender from doing so.
Given that the input can contain a fragment which is a strict subset of the previous fragment, the implementation needs to guard against this possibility, for example with a if (offset > end)
check.
Takeaways
- Whenever untrusted data is being processed, it should be treated with the mindset of an attacker: consider not only the expected format of the input, but rather all possible formats in could take.
- Whenever a length or size is being computed, particularly in a memory-unsafe language like C, verify that it’s not possible for the value to be negative or otherwise invalid.
- When working with data ranges, carefully consider the edge cases, such as:
- Are the ranges always ordered by their start index?
- Can zero-length ranges exist?
- Must the ranges be contiguous? (Can gaps exist between them?)
- Can the ranges overlap? If so, are all types of overlap handled? (Adjacent, partial overlap, duplicate, strict subset)