Open Bug 1354177 Opened 7 years ago Updated 2 years ago

libjpeg-turbo: shift exponent -1 is negative [@ decode_mcu_fast]

Categories

(Core :: Graphics: ImageLib, defect, P3)

People

(Reporter: tsmith, Unassigned)

Details

(Keywords: csectype-undefined, testcase, Whiteboard: gfx-noted)

Attachments

(2 files)

Attached image test_case.jpg
Found while fuzzing libjpeg-turbo commit da2a27ef.

STR:
Build with -fsanitize=shift
./djpeg -dct fast -nosmooth -onepass -dither none -scale 1/4 test_case.jpg

jdhuff.c:672:5: runtime error: shift exponent -1 is negative
    #0 0x823f6cb in decode_mcu_fast jdhuff.c:672:5
    #1 0x823f6cb in decode_mcu jdhuff.c:779
    #2 0x820ecdb in consume_data jdcoefct.c:232:13
    #3 0x817c093 in jpeg_start_decompress jdapistd.c:65:19
    #4 0x81572a1 in main djpeg.c:664:10
    #5 0xf74fc636 in __libc_start_main (/lib32/libc.so.6+0x18636)
    #6 0x8062f07 in _start (djpeg+0x8062f07)
Unfortunately, I've been unable to reproduce this.  Can you provide more information, including the compiler and O/S used?  I've tried it with Clang and GCC 5 & 6.  No dice.
Never mind.  I keep forgetting I have to build with 32-bit code to repro these things.  Investigating.
Hmmm...  Well, the core problem with this is not the negative shift.  The problem is that, at the time the error occurs, bits_left is 0, and that shouldn't ever happen.  When entering decode_mcu(), cinfo->src->bytes_in_buffer just happens to be exactly 512, which is precisely the break point between the fast and slow Huffman paths.  Replacing

  if (cinfo->src->bytes_in_buffer < BUFSIZE * (size_t)cinfo->blocks_in_MCU

with

  if (cinfo->src->bytes_in_buffer <= BUFSIZE * (size_t)cinfo->blocks_in_MCU

works around the issue, but I'm not sure yet whether that's the ultimate fix.  I need to develop a better understanding of the failure mechanism.
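For reference, the fast/slow path selection at the top of decode_mcu() looks roughly like this (paraphrased from upstream jdhuff.c, not copied verbatim, so the exact names and surrounding logic are approximate):

  int usefast = 1;

  /* Fall back to the slow decoder if the source buffer may run dry
     before the whole MCU is decoded */
  if (cinfo->src->bytes_in_buffer < BUFSIZE * (size_t)cinfo->blocks_in_MCU ||
      cinfo->unread_marker != 0)
    usefast = 0;

  /* ... */

  if (usefast) {
    if (!decode_mcu_fast(cinfo, MCU_data))
      usefast = 0;                /* retry the MCU with the slow path */
  }
  if (!usefast) {
    if (!decode_mcu_slow(cinfo, MCU_data))
      return FALSE;
  }
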
Ignore most of what I said above.  The buffer size was a red herring.  The actual number of bytes decoded, if the MCU is allowed to finish decoding after the negative shift, is only 4, so it's definitely not a problem with the buffer size.  It's just that the slow Huffman decoder explicitly checks whether bits_left is 0, whereas the fast Huffman decoder doesn't, for performance reasons: even that one additional conditional in the inner loop causes a 5% slow-down on average (and I just tried it, so I'm not speaking hypothetically).  If the fast decoder encounters a runaway MCU like this, it guards against it by clamping the result at 255 so as not to overrun the table.  Thus, as far as I can tell, this error is innocuous and not exploitable, but I cannot assert that with anything close to 100% confidence.
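
(For reference, the clamping I mean is the & 0xFF in the final table lookup at the end of HUFF_DECODE_FAST; paraphrased, it keeps a runaway code from indexing past the 256-entry huffval[] array:)

  s = htbl->pub->huffval[(int)(s + htbl->valoffset[nb]) & 0xFF]; \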

Given that the actual error is occurring at jdhuff.h:226, further research seems to indicate that I could get away with adding a single conditional right above that while() loop, in order to check the value of bits_left before going into the loop.  That at least seems to be performance-neutral.  However, I have no clue what I should check bits_left against, since the number of iterations in the while() loop will depend on the values in the maxcode table.  At first glance, it would seem that I would need to pre-compute another table, which is a non-starter.
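
The general shape of what I have in mind is something like the following.  This is a hypothetical sketch, not the attached patch, and SOME_BOUND is a placeholder for exactly the quantity I don't know how to compute cheaply:

    /* Hypothetical guard: check bits_left before entering the loop and
       handle a potential underflow somehow, e.g. by refilling the bit
       buffer or falling back to the slow decoder. */ \
    if (bits_left < SOME_BOUND) { \
      /* ... */ \
    } \
    while (s > htbl->maxcode[nb]) { \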

At the moment, unless anyone has any further insight, I don't know how to fix this and am disinclined to try unless it can be shown to be exploitable.
I tried debugging with gdb, but the info I wanted was optimized out.  I tried printf debugging, and the behavior changed.  I'm not exactly sure which line in the macro is even causing the UB report.
It's possible that the copy of libjpeg-turbo that we have in the tree is not affected by this, because we apply a patch to the fast Huffman decoder that upstream libjpeg-turbo does not have.

https://dxr.mozilla.org/mozilla-central/source/media/libjpeg/1050342.diff
No, I can still repro it using the Mozilla patch.  The issue occurs in this line of the macro:

  s |= GET_BITS(1); \

Basically, the sequence is as follows:
- decode_mcu is called with cinfo->src->bytes_in_buffer == 512, which causes the fast Huffman decoder to be used.
- The shift error occurs almost immediately, on the first iteration of the blkn loop in decode_mcu_fast(), on the first invocation of HUFF_DECODE_FAST (when decoding the DC coefficient).

The sequence within HUFF_DECODE_FAST is as follows (NOTE: HUFF_LOOKAHEAD == 8):

  FILL_BIT_BUFFER_FAST; \
  /* This fills the bit buffer with (word size - 16) bits of new data, so when using 32-bit code, bits_left == 16 at this point, and when using 64-bit code, bits_left == 48 at this point.  That's why using 32-bit code is key to reproducing this. */

  s = PEEK_BITS(HUFF_LOOKAHEAD); \
  /* s == 239 */

  s = htbl->lookup[s]; \
  /* s == 2304 */

  nb = s >> HUFF_LOOKAHEAD; \
  /* nb == 9 */

  DROP_BITS(nb); \
  /* bits_left == 7 */

  s = s & ((1 << HUFF_LOOKAHEAD) - 1); \
  /* s == 0 */

  if (nb > HUFF_LOOKAHEAD) { \
    s = (get_buffer >> bits_left) & ((1 << (nb)) - 1); \
    /* s == 478 */

    while (s > htbl->maxcode[nb]) { \
      /* At this point, htbl->maxcode[9..16] are all -1, so the loop will iterate 8 times until hitting the guardrail (maxcode[17] == 0xFFFFFL).  But since bits_left == 7 on the first iteration of the loop, it will be -1 on the last iteration, which is when the negative shift error occurs. */
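
      s <<= 1; \
      s |= GET_BITS(1); \
      /* (Loop body paraphrased from the upstream macro; the Mozilla-patched copy differs slightly.)  GET_BITS(1) expands to roughly ((int)(get_buffer >> (bits_left -= 1))) & 1, so once bits_left has been decremented to -1, shifting by the negative count is the UB that the sanitizer reports. */

      nb++; \
    } \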

I think I was able to come up with a fix after doing the above analysis.  Stand by.
Unfortunately, while the fix I came up with was better than the first one, it still produces a 5% regression in some cases.  :(  I give up.
One thing I will add is that, to the best of my understanding, this represents a worst case for the algorithm.  I don't think nb can be greater than 9, meaning that bits_left won't ever be less than 7 when entering the while() loop, and thus the worst case is that this negative shift occurs only on the last iteration of the loop.  It seems to me as if the only real consequence is bogus output, but given that the input is bogus, I'm not that concerned about that.  However, it would be nice to have a better understanding of why the Huffman table in this specific file is generating this behavior.  If there's something bogus about the table, then maybe that can be trapped as the table is read from the file.

Note also that the file triggers a libjpeg warning (JWRN_NOT_SEQUENTIAL) prior to the negative shift occurring, so it would be quite easy to mitigate this situation by trapping the warning and treating it as an error, which I recommend that all security-conscious applications do anyhow (refer to http://www.libjpeg-turbo.org/pmwiki/uploads/About/TwoIssueswiththeJPEGStandard.pdf for other reasons why).
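
For instance, an application can promote libjpeg warnings to hard errors by installing its own emit_message handler.  A minimal sketch (the handler and function names are made up, and error recovery via setjmp()/longjmp() is omitted for brevity):

  #include <stdio.h>
  #include <jpeglib.h>

  /* Treat any libjpeg warning (msg_level == -1, e.g. JWRN_NOT_SEQUENTIAL)
     as a fatal error instead of letting decompression continue. */
  static void warn_as_error(j_common_ptr cinfo, int msg_level)
  {
    if (msg_level < 0)
      cinfo->err->error_exit(cinfo);  /* real apps usually longjmp() out here */
  }

  static void decompress_strictly(FILE *infile)
  {
    struct jpeg_decompress_struct cinfo;
    struct jpeg_error_mgr jerr;

    cinfo.err = jpeg_std_error(&jerr);
    jerr.emit_message = warn_as_error;  /* install the strict handler */

    jpeg_create_decompress(&cinfo);
    jpeg_stdio_src(&cinfo, infile);
    (void)jpeg_read_header(&cinfo, TRUE);
    (void)jpeg_start_decompress(&cinfo);
    /* ... jpeg_read_scanlines() loop and jpeg_finish_decompress() ... */
    jpeg_destroy_decompress(&cinfo);
  }
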
Attaching the patch that had the least performance impact, in case Mozilla developers want to evaluate it further.  At the moment, I am not interested in pursuing a fix for this upstream unless:

(a) It proves to be an exploitable issue, or
(b) It can be reproduced with a valid JPEG file, or
(c) Further information comes to light regarding a method of fixing it without affecting performance (for instance, a method of predicting whether the issue will occur, based on the contents of the Huffman tables).
Because this patch has a measurable performance impact and its potential ramifications are not well understood (i.e. it is possible that it could cause other problems), it has not been accepted upstream.
Whiteboard: gfx-noted
Severity: normal → S3