Bug 1252200 (Open) - Opened 8 years ago, Updated 1 year ago

libjpeg-turbo: DoS via small Image with large Dimensions

Categories

(Core :: Graphics: ImageLib, defect, P3)

People

(Reporter: jaas, Unassigned)

Details

(Keywords: csectype-dos, sec-low, Whiteboard: [gfx-noted])

Attachments

(1 file)

When a JPEG file does not provide enough data, the bit buffer is filled with zero bits (function jpeg_fill_bit_buffer() in jdhuff.c). A probable goal of this behavior is to make it possible to display incomplete or corrupted images. An attacker can exploit this to cause memory exhaustion on the system, which leads to an OOM kill of the application responsible for opening the malicious image. It is possible to create a 102-byte file that expands to 760884499 bytes of data. (Using even larger dimensions produced a 12 GB result, though that file was ignored by the relevant applications.) By including the image several times in a website, it was possible to get Firefox killed by the Linux OOM killer. Similarly, the Xorg server was killed during the tests. In the worst case, any windowed application can be killed, which implies a risk of losing unsaved data.

The following Perl script generates a file with dimensions of 0x4040 * 0x3c3c pixels. No image data is provided, which makes it possible to cause high memory usage without supplying much data.

[Will attach perl script]

The decoder throws a JWRN_HIT_MARKER warning (Corrupt JPEG data: premature end of data segment) but continues decoding the file. It fills the bit buffer with zero bits, leading to memory exhaustion that scales with the specified dimensions.
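For a rough back-of-the-envelope estimate of why the dimensions alone drive the memory footprint (assuming the decoder produces 3-byte RGB output; any output file header adds a few bytes more):

    0x4040 x 0x3c3c pixels = 16448 x 15420 = 253,628,160 pixels
    253,628,160 pixels x 3 bytes/pixel    = 760,884,480 bytes (~726 MB)

This lines up with the ~760 MB expansion figure quoted above, produced from a 102-byte input file.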

Test 1: Display the image several times on a web page:

<?php
for ($i = 0; $i < 40; $i++)
{
    echo '<img width="100px" height="100px" src="oom.jpg?'.$i.'">';
}
?>

Test 2: Keep refreshing the page with only a few images:

<html>
    <head>
    </head>
    <body>
        <script>
            setInterval(function(){  location.reload();  }, 200);
        </script>
        <img src="oom.jpg">
        <img src="oom.jpg">
        <img src="oom.jpg">
    </body>
</html>

This issue was reproduced on an up-to-date Arch Linux system, on both a laptop with 8 GB of memory and a PC with 32 GB of memory. Further tests verified the vulnerability on Ubuntu 14.04, Android/Chrome, and iOS/Safari. Exploitation (i.e., crashing the browser) was found to be more difficult with swap space enabled.

It is recommended to treat the JWRN_HIT_MARKER warning as an error and abort decompression when too much data is missing relative to the specified image dimensions.
NOTE:  The bug description above is largely cribbed from a recent security audit report by Cure53.  I asked Josh to post it as a Mozilla bug in order to get input from other developers, since I don't quite know how to solve it upstream without breaking backward compatibility with the libjpeg API.  The bug affects Firefox (see below.)

LJT-01-004:  Pathological JPEG causes decoder to use huge amounts of memory (affects libjpeg and all derivatives thereof)

Much the same could be said about this issue as was said about LJT-01-003 (https://bugzilla.mozilla.org/show_bug.cgi?id=1252196).  It's again a case in which the default behavior of the underlying libjpeg API is to ignore warnings rather than treat them as errors.  On the surface, this issue seems to be specific to libjpeg-turbo, whereas 1252196 can be readily reproduced with libjpeg.  However, this is only because, in libjpeg-turbo 1.4.x, the Huffman decoder began using the default Huffman tables if a particular JPEG file didn't include such tables (this was to allow motion-JPEG frames to be decompressed.)  Thus, when attempting to decode the JPEG file that the Cure53 Perl script generates, libjpeg produces the following error:

  Huffman table 0x00 was not defined

However, it is straightforward to generate a similar bogus JPEG image that contains Huffman tables (see test004.c attached), and this new image will readily cause the same OOM issue to occur in libjpeg.

I have no idea how to fix this.  As with 1252196, fixing this issue would probably require changes to the underlying algorithms, because changing the default libjpeg API behavior such that it treats warnings as errors would create a backward incompatibility.

To elaborate on why making warnings fatal by default isn't a very palatable solution:

libjpeg has traditionally (and by "traditionally", I mean since the early 90's) handled warnings by calling emit_message() in the error handler but continuing to process the image.  Lots of programs (particularly image viewers and such) rely on this behavior, because it allows them to decode as much of a corrupt image as possible.  However, as pointed out by the Cure53 report, it also opens a couple of exploits (this one and 1252196.)  Changing the default behavior would effectively change the behavior of the libjpeg API, because it would cause that API to call back the error_exit() function in the error handler when a warning is encountered, instead of the emit_message() function.

Any program is already free to make warnings fatal, simply by implementing its own error handler and causing any call to emit_message() to trigger the same application behavior as error_exit().  However, most programs don't do that.  Most programs either use the "stock" error handler (jpeg_std_error()) or write their own that doesn't treat warnings as fatal, so most programs will be affected by this issue.  Thus, it would be very desirable to craft a fix that works around the issue in such a way that

(a) The default behavior of the libjpeg API is unchanged (thus, applications won't suddenly discover that error_exit() is being called when it previously wasn't.)

(b) All applications receive the fix, regardless of whether they are using a custom error handler or the default one.
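To illustrate the status quo described above: the following is a minimal sketch (not libjpeg or libjpeg-turbo code; the struct and function names are made up) of how an individual application can already make warnings fatal today, using the usual setjmp-based error-handling pattern from libjpeg's example.c.

#include <stdio.h>
#include <setjmp.h>
#include <jpeglib.h>

struct my_error_mgr {
  struct jpeg_error_mgr pub;    /* fields required by the libjpeg API */
  jmp_buf setjmp_buffer;        /* for returning control to the caller */
};

static void my_error_exit(j_common_ptr cinfo)
{
  struct my_error_mgr *myerr = (struct my_error_mgr *)cinfo->err;
  (*cinfo->err->output_message)(cinfo);   /* report the message */
  longjmp(myerr->setjmp_buffer, 1);       /* return to the setjmp point */
}

/* msg_level < 0 indicates a corrupt-data warning; treat those as fatal. */
static void my_emit_message(j_common_ptr cinfo, int msg_level)
{
  if (msg_level < 0)
    my_error_exit(cinfo);
  /* Trace messages (msg_level >= 0) are simply dropped here. */
}

int decode_strict(FILE *infile)
{
  struct jpeg_decompress_struct cinfo;
  struct my_error_mgr jerr;

  cinfo.err = jpeg_std_error(&jerr.pub);
  jerr.pub.error_exit = my_error_exit;
  jerr.pub.emit_message = my_emit_message;

  if (setjmp(jerr.setjmp_buffer)) {
    /* A warning or error occurred; treat the image as undecodable. */
    jpeg_destroy_decompress(&cinfo);
    return -1;
  }

  jpeg_create_decompress(&cinfo);
  jpeg_stdio_src(&cinfo, infile);
  jpeg_read_header(&cinfo, TRUE);
  /* ... jpeg_start_decompress(), read scanlines, etc. ... */
  jpeg_destroy_decompress(&cinfo);
  return 0;
}

This fixes the problem for any one application, but as point (b) above notes, it does nothing for the many programs that keep the stock behavior.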
Group: core-security → gfx-core-security
Whiteboard: [gfx-noted]
NOTE: The symptoms of this issue manifest in Firefox as follows:

Take the image generated by test004 (oom.jpg) and open it in Firefox using the appropriate file:/// URL.  On my machine, this causes the browser's memory usage to increase by about 1 GB, so it is pretty easy to generate a web page with multiple copies of the image and make the browser consume all available memory, as described in Josh's comment above.  However, in this case, closing the tab does restore things to normal.  The main danger here is that, if the ballooning memory usage causes the filesystem cache to thrash, the O/S could become unusable until the user force-quits all running applications, which may cause them to lose their work.  As the Cure53 report points out, this is mostly a danger under Un*x, as OS X and Windows generally do a better job of recovering from memory exhaustion.
I'm not sure we can have our cake and eat it too here, though I'm obviously not intimately familiar with the implementation details of libjpeg-turbo.

It seems to me that if we want to fix the insecure behavior for as many folks as possible we'd need libjpeg-turbo to break backwards compat and treat warnings as errors. If we end up feeling that breaking backwards compat is necessary then calling error_exit() in the default handler obviously won't fix things for people with custom handlers, but I'm not sure how we could fix things for them without essentially ignoring the handlers (I just don't know enough about the handler impl, maybe DRC knows a way).

Is there a way to limit the breakage by only doing error_exit() for certain more dangerous types of errors?

The only other alternative I can think of is to use the API extension mechanism to introduce a flag toggling behavior for the default handler. You could do it either way - break backwards compat and let people restore it easily with the flag, or leave backwards compat and let people get the secure behavior by setting the flag.
(In reply to Josh Aas from comment #4)
> It seems to me that if we want to fix the insecure behavior for as many
> folks as possible we'd need libjpeg-turbo to break backwards compat and
> treat warnings as errors. If we end up feeling that breaking backwards
> compat is necessary then calling error_exit() in the default handler
> obviously won't fix things for people with custom handlers, but I'm not sure
> how we could fix things for them without essentially ignoring the handlers
> (I just don't know enough about the handler impl, maybe DRC knows a way).

I'm hoping that maybe there's a way to fix this algorithmically so that the codec detects this specific situation and trips an error, but I may be dreaming.  I don't know whether there may be some specified limit to the number of scans in a progressive Huffman or arithmetic file, but the file generated by test003.c has about 800,000 scans.  That definitely seems excessive.  :)

Also, the last sentence in the Cure53 description says:  "if error tolerance is desired here, skip the decoding of bogus scans that do not supply additional information."  That seems like a winning idea, but I'm not sure how to do it.

> Is there a way to limit the breakage by only doing error_exit() for certain
> more dangerous types of errors?

These are the warnings that are generated (many many times) with this particular bug:

If using progressive Huffman:
jdhuff.c:380 -- Corrupt JPEG data: premature end of data segment (JWRN_HIT_MARKER)
(appears always)
jdphuff.c:143 -- Inconsistent progression sequence for component 0 coefficient 0 (JWRN_BOGUS_PROGRESSION)
(appears only if using either the first or second sos[] value in test003.c)
jdphuff.c:147 -- Inconsistent progression sequence for component 0 coefficient 1 (JWRN_BOGUS_PROGRESSION)
(appears only if using either the first or third sos[] value in test003.c)

If using arithmetic:
jdarith.c:663 -- Inconsistent progression sequence for component 0 coefficient 0 (JWRN_BOGUS_PROGRESSION)
(appears if using the first or second sos[] value in test003.c)
jdarith.c:667 -- Inconsistent progression sequence for component 0 coefficient 1 (JWRN_BOGUS_PROGRESSION)
(appears if using the first or third sos[] value in test003.c)

Note that if using the fourth value of sos[], along with arithmetic coding, *no* warning appears (ugh.)  That may not matter for Mozilla, since you guys aren't using arithmetic decoding, but from the point of view of fixing this in libjpeg-turbo, it seems that we can't always rely on warnings to tell us that this bug is occurring.  Yet another reason why an algorithmic solution is desirable.

Also, treating the JWRN_HIT_MARKER warning at 380:jdhuff.c as an error would really defeat the purpose of fault tolerance-- that is the warning most likely to be encountered when an image viewer tries to decode a corrupt JPEG and wants to display as much of the image as possible.

> The only other alternative I can think of is to use the API extension
> mechanism to introduce a flag toggling behavior for the default handler. You
> could do it either way - break backwards compat and let people restore it
> easily with the flag, or leave backwards compat and let people get the
> secure behavior by setting the flag.

We might have to resort to that.  If we did, then it would mean that applications wouldn't automatically inherit the fix, but maybe that's OK, because it seems like this is mainly going to be a concern for browsers and such.  Standalone image viewers/editors/converters will probably want to keep the existing behavior, so they can decode as much of the corrupt JPEG as possible, and if someone tries to run this pathological 800,000-scan JPEG through ImageMagick, they'll wait 30 seconds and CTRL-C it, so no harm done.

If we modified the behavior of the default emit_message() function, that should catch *most* applications.  Even if an application uses its own error manager, it's very rare for it to override emit_message(), and if it does, then it will be its responsibility to decide whether to treat warnings as errors or not.

For this specific bug, it would even be possible to set a warning limit-- even something as high as 100-- after which warnings would be treated as errors.  However, that wouldn't fix the other bug (https://bugzilla.mozilla.org/show_bug.cgi?id=1252196 AKA LJT-01-004), because that other bug only causes one warning to be triggered (JWRN_HIT_MARKER at 380:jdhuff.c.)
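A hypothetical sketch of that warning-limit idea, relying on the fact that the error manager already counts corrupt-data warnings in its num_warnings field (the threshold of 100 and the function name below are arbitrary, not anything in libjpeg-turbo):

#include <jpeglib.h>

#define MAX_CORRUPT_WARNINGS 100   /* arbitrary threshold */

static void limited_emit_message(j_common_ptr cinfo, int msg_level)
{
  struct jpeg_error_mgr *err = cinfo->err;

  if (msg_level < 0) {                         /* corrupt-data warning */
    if (err->num_warnings >= MAX_CORRUPT_WARNINGS)
      (*err->error_exit)(cinfo);               /* give up on this image */
    err->num_warnings++;
  }
  /* Trace messages (msg_level >= 0) are ignored here, as the default
     handler effectively does unless trace_level is raised. */
}

Because num_warnings is already maintained by the error manager, this could be adopted without changing the emit_message() signature or the rest of the libjpeg API.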
(NOTE: I just noticed that I got my bugs mixed up.  I was mostly talking about LJT-01-003 in the above comment, which is the other bug: https://bugzilla.mozilla.org/show_bug.cgi?id=1252196.  This bug is LJT-01-004.)

Note that this issue is - AFAICT - unaddressed in the standard (or in libjpeg-turbo, based on the lack of consensus about how to proceed.)  https://libjpeg-turbo.org/pmwiki/uploads/About/TwoIssueswiththeJPEGStandard.pdf was authored to draw attention to this issue.

Group: gfx-core-security
Keywords: csectype-dos
QA Whiteboard: qa-not-actionable
See Also: → CVE-2023-32209

In the process of migrating remaining bugs to the new severity system, the severity for this bug cannot be automatically determined. Please retriage this bug using the new severity system.

Severity: major → --

Whereas Issue #1252196 is an exploit involving relatively tiny progressive JPEG images (both valid and invalid) with unreasonable numbers of scans, this issue is an exploit involving relatively tiny progressive JPEG images with an unreasonable image size.

The JPEG format has 16-bit width and height fields, so images can be up to 4 gigapixels (64k x 64k) in size. When decompressing a multi-scan JPEG image, libjpeg[-turbo] must allocate a "whole-image buffer" in order to render each successive scan on top of the previously-decompressed scan. (It does this within the body of jpeg_start_decompress().) Thus, a tiny invalid progressive JPEG image can be generated that causes libjpeg[-turbo] to allocate a 4-gigapixel x 32-bit whole-image buffer. libjpeg[-turbo] encounters junk data in the invalid image and throws a warning, but since libjpeg API warnings aren't fatal by default, it continues trying to decompress the image, leading to memory exhaustion.

The attack surface can be greatly reduced by making libjpeg API warnings fatal. (djpeg in libjpeg-turbo 2.1.x has a -strict argument that demonstrates how to do that.) However, it is still possible to craft a perfectly valid 2 MB JPEG image that requires gigabytes of memory to decompress, by taking advantage of the same "EOB run" feature that was exploited in Issue #1252196 (more specifically, the ability of the progressive JPEG format to represent large runs of zeroes using a very small amount of data.)

As with Issue #1252196, the exploit can be greatly diminished by making libjpeg API warnings fatal, but the only way to completely eliminate it is to place a reasonable limit on the JPEG image dimensions or to place a limit on the amount of memory used by libjpeg-turbo. (Refer to the -maxmemory switch in djpeg.)
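For an application that wants to protect itself along those lines today, the following is a rough sketch (not libjpeg-turbo code; the 16384-pixel and 256 MB limits are arbitrary examples) of rejecting oversized images after jpeg_read_header() and capping libjpeg's internal allocations the same way djpeg's -maxmemory switch does:

#include <stdio.h>
#include <jpeglib.h>

int decode_with_limits(FILE *infile)
{
  struct jpeg_decompress_struct cinfo;
  struct jpeg_error_mgr jerr;

  cinfo.err = jpeg_std_error(&jerr);   /* stock handler: errors call exit() */
  jpeg_create_decompress(&cinfo);
  jpeg_stdio_src(&cinfo, infile);
  jpeg_read_header(&cinfo, TRUE);

  /* Reject anything larger than 16384 x 16384 pixels before any
     whole-image buffer can be allocated in jpeg_start_decompress(). */
  if (cinfo.image_width > 16384 || cinfo.image_height > 16384) {
    jpeg_destroy_decompress(&cinfo);
    return -1;
  }

  /* Cap libjpeg's internal memory use (what djpeg's -maxmemory sets). */
  cinfo.mem->max_memory_to_use = 256L * 1024L * 1024L;   /* 256 MB */

  jpeg_start_decompress(&cinfo);
  /* ... read scanlines, jpeg_finish_decompress(), ... */
  jpeg_destroy_decompress(&cinfo);
  return 0;
}

Note that with the default memory manager, exceeding max_memory_to_use generally causes large arrays to be swapped to temporary backing-store files rather than producing an immediate failure, so the hard dimension check is the more decisive of the two safeguards here.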

Severity: -- → S3
See Also: → 1823614

Filed bug 1823614 to give us a finite limit on the number of scans. Not sure whether there are additional or separate problems here beyond that.
