blog

Spot the Vulnerability: Data Ranges and Untrusted Input

In 1997, a flaw was discovered in how Linux and Windows handled IP fragmentation, a Denial-of-Service vulnerability which allowed systems to be crashed remotely.

Background: IP Fragmentation

IP Fragmentation occurs when a packet passes through a link whose MTU is smaller than the packet size; it gets fragmented into multiple smaller packets, and subsequently reassembled by the receiver.

The data in these fragments would typically be non-overlapping, but since partial overlap is permitted by the protocol, IP implementations need to handle that case.

Re-assembling Fragmented Packets

When Linux reassembled fragmented packets, it would loop over each fragment, copying the payload of each fragment into a newly-allocated buffer. In order to handle the case where a fragment overlaps with the previous fragment, if its start offset was less than the end offset of the previous fragment, then its start offset would be moved forward to eliminate the overlap. Here’s a diagram to help visualize this:

0                    22                40
v                     v                 v
|-----Fragment--1-----|
                  |-----Fragment--2-----|
                  ^  \
                 19   \
                       Overlap here
becomes...
0                    22                40
v                     v                 v
|-----Fragment--1-----|
                      |---Fragment--2---|
                    \
                     \
                      Overlap discarded

And here’s how this was implemented (with only relevant code shown):

/*
 *      Determine the position of this fragment.
 */
end = offset + ntohs(iph->tot_len) - ihl;
...
/*
 *      We found where to put this one.
 *      Check for overlap with preceding fragment, and, if needed,
 *      align things so that any overlaps are eliminated.
 */
if (prev != NULL && offset < prev->end)
{
        i = prev->end - offset;
        offset += i;    /* ptr into datagram */
        ptr += i;       /* ptr into fragment data */
}
...
/* Fill in the structure. */
fp->offset = offset;
fp->end = end;
fp->len = end - offset;

offset is the fragment’s offset in the complete datagram. So if offset is before the end of the previous fragment, then the middle if block above sets it to the value of prev->end, eliminating the overlap. Finally, length of the fragment is updated based on the new offset and end of the fragment.

Spotting the Vulnerability

Whenever we’re using arithmetic to compute a size or length, we need to ask ourselves: can this turn out negative?

fp->len = end - offset;

At first glance, it seems like the answer is "no" for this fragment: end is defined by adding the fragment’s payload length (definitely >= 0) to the value of offset, so it would seem that end - offset >= 0 must be true.

However, this analysis ignores what might have happened in between the assignment of end and the evaluation of end - offset; in particular, the handling of overlap. offset effectively gets set to prev->end (the end of the previous fragment), but still, this fragment should at worst be a duplicate of the previous one, meaning prev->end == end, and consequently end - offset == 0. So after a second look, this might still look problem-free.

The dangerous word here is "should". Suppose a pair of fragments comes in that look more like this:

0                    22
v                     v
|-----Fragment--1-----|
    |--Fragment--2--|
    ^               ^
    4              20

This is obviously wrong and shouldn’t happen; fragments might overlap or duplicate on another, but they should never be a strict subset of another fragment. Maybe so, but can they, anyway? It turns out the answer is yes, they can, for example if they’re deliberately crafted that way by a malicious sender.

If the length of the second fragment is not enough to "cover" the overlap correction, then the updated offset of the fragment becomes greater than its end, and consequently the computed fp->len is negative. This length eventually gets passed to memcpy(), causing a buffer overflow and an extremely large copy that will lock up or crash the system.

What Went Wrong?

The root cause here was arguably a misplaced trust in untrusted input; it’s a very easy mistake to make as a developer, since we naturally expect that all parts of a system want to work correctly (even the parts that we don’t control). In this case, input is coming from a completely untrusted source, and while there’s a whole class of inputs which a robust sender should never send, there’s nothing which prevents a sender from doing so.

Given that the input can contain a fragment which is a strict subset of the previous fragment, the implementation needs to guard against this possibility, for example with a if (offset > end) check.

Takeaways

Whenever untrusted data is being processed, it should be treated with the mindset of an attacker: consider not only the expected format of the input, but rather all possible formats in could take.
Whenever a length or size is being computed, particularly in a memory-unsafe language like C, verify that it’s not possible for the value to be negative or otherwise invalid.
When working with data ranges, carefully consider the edge cases, such as:
- Are the ranges always ordered by their start index?
- Can zero-length ranges exist?
- Must the ranges be contiguous? (Can gaps exist between them?)
- Can the ranges overlap? If so, are all types of overlap handled? (Adjacent, partial overlap, duplicate, strict subset)

Spot the Vulnerability: Loops and Terminating Conditions

by Adam Singleton | Jan 7, 2022 | Developer Blog, Home Display

Spot the Vulnerability: Loops and Terminating Conditions In memory-unsafe languages like C, special care must be taken when copying untrusted data, particularly when copying it to another buffer. In this post, we\'ll spot and mitigate a past vulnerability in Linux\'s...

Accurate Timing

by Adam Singleton | Sep 24, 2021 | Developer Blog, Home Display

In many tasks we need to do something at given intervals of time. The most obvious ways may not give you the best results. Time? Meh. The most basic tasks that don't have what you might call CPU-scale time requirements can be handled with the usual language and...

Exploring Dependent Types in Idris

by Adam Singleton | Aug 27, 2021 | Developer Blog, Home Display

When I'm not coding the "impossible" at Art+Logic, I take a lot of interest in new programming technologies and paradigms; even if they're not yet viable for use in production, there can often be takeaways for improving your everyday code. My current...

« Older Entries

Spot the Vulnerability: Data Ranges and Untrusted Input

Background: IP Fragmentation

Re-assembling Fragmented Packets

Spotting the Vulnerability

What Went Wrong?

Takeaways

Recent posts

Categories

Spot the Vulnerability: Loops and Terminating Conditions

Accurate Timing

Exploring Dependent Types in Idris

Solutions

About