In-Reply-To: <20160421144703.GB29616@pd.tnic>
References: <1461185746-8017-1-git-send-email-keescook@chromium.org>
 <1461185746-8017-2-git-send-email-keescook@chromium.org>
 <20160421144703.GB29616@pd.tnic>
Date: Thu, 21 Apr 2016 13:04:42 -0700
Subject: Re: [PATCH 1/5] x86, KASLR: Update description for decompressor worst case size
From: Kees Cook
To: Borislav Petkov
Cc: Ingo Molnar, Baoquan He, Yinghai Lu, Ingo Molnar, "x86@kernel.org",
    Andrew Morton, Andrey Ryabinin, Dmitry Vyukov, "H.J. Lu",
    Josh Poimboeuf, Andy Lutomirski, LKML

On Thu, Apr 21, 2016 at 7:47 AM, Borislav Petkov wrote:
> On Wed, Apr 20, 2016 at 01:55:42PM -0700, Kees Cook wrote:
> ...
>> diff --git a/arch/x86/boot/header.S b/arch/x86/boot/header.S
>> index 6236b9ec4b76..6b8f8728c1fa 100644
>> --- a/arch/x86/boot/header.S
>> +++ b/arch/x86/boot/header.S
>> @@ -440,6 +440,93 @@ setup_data: .quad 0 # 64-bit physical pointer to
>>
>> pref_address: .quad LOAD_PHYSICAL_ADDR # preferred load addr
>>
>> +#
>> +# Getting to provably safe in-place decompression is hard. Worst case
>> +# behaviours need be analyzed. Here let's take decompressing gzip-compressed
>> +# kernel as example to illustrate it:
>> +#
>> +# The file layout of gzip compressed kernel is as follows. For more
>> +# information, please refer to RFC 1951 and RFC 1952.
>> +#
>> +# magic[2]
>> +# method[1]
>> +# flags[1]
>> +# timestamp[4]
>> +# extraflags[1]
>> +# os[1]
>> +# compressed data blocks[N]
>> +# crc[4] orig_len[4]
>> +#
>> +# resulting in 18 bytes of non compressed data overhead.
>
> What's "non compressed data overhead"?
>
> Does that want to say:
>
> "... resulting in 18 bytes overhead of uncompressed data."
>
> perhaps?

Yeah, that reads much more clearly. I'll change it.

>> +#
>> +# Files divided into blocks
>> +# 1 bit (last block flag)
>> +# 2 bits (block type)
>> +#
>> +# 1 block occurs every 32K -1 bytes or when there 50% compression
>> +# has been achieved. The smallest block type encoding is always used.
>> +#
>> +# stored:
>> +# 32 bits length in bytes.
>> +#
>> +# fixed:
>> +# magic fixed tree.
>> +# symbols.
>> +#
>> +# dynamic:
>> +# dynamic tree encoding.
>> +# symbols.
>> +#
>> +#
>> +# The buffer for decompression in place is the length of the uncompressed
>> +# data, plus a small amount extra to keep the algorithm safe. The
>> +# compressed data is placed at the end of the buffer. The output pointer
>> +# is placed at the start of the buffer and the input pointer is placed
>> +# where the compressed data starts. Problems will occur when the output
>> +# pointer overruns the input pointer.
>> +#
>> +# The output pointer can only overrun the input pointer if the input
>> +# pointer is moving faster than the output pointer. A condition only
>> +# triggered by data whose compressed form is larger than the uncompressed
>> +# form.
>> +#
>> +# The worst case at the block level is a growth of the compressed data
>> +# of 5 bytes per 32767 bytes.
>> +#
>> +# The worst case internal to a compressed block is very hard to figure.
>> +# The worst case can at least be bounded by having one bit that represents
>> +# 32764 bytes and then all of the rest of the bytes representing the very
>> +# very last byte.
>> +#
>> +# All of which is enough to compute an amount of extra data that is required
>> +# to be safe. To avoid problems at the block level allocating 5 extra bytes
>> +# per 32767 bytes of data is sufficient. To avoid problems internal to a
>> +# block adding an extra 32767 bytes (the worst case uncompressed block size)
>> +# is sufficient, to ensure that in the worst case the decompressed data for
>> +# block will stop the byte before the compressed data for a block begins.
>> +# To avoid problems with the compressed data's meta information an extra 18
>> +# bytes are needed. Leading to the formula:
>> +#
>> +# extra_bytes = (uncompressed_size >> 12) + 32768 + 18 + decompressor_size
>> +#
>> +# Adding 8 bytes per 32K is a bit excessive but much easier to calculate.
>> +# Adding 32768 instead of 32767 just makes for round numbers.
>> +# Adding the decompressor_size is necessary as it musht live after all
>> +# of the data as well. Last I measured the decompressor is about 14K.
>> +# 10K of actual data and 4K of bss.
>
> I guess reflow the paragraphs while at it, as well?
>
> "Adding 8 bytes per 32K is a bit excessive but much easier to calculate.
> Adding 32768 instead of 32767 just makes for round numbers. Adding the
> decompressor_size is necessary as it musht live after all of the data as
> well. Last I measured the decompressor is about 14K. 10K of actual data
> and 4K of bss."
>
> and so on...

Yeah, I'd been reflowing as I went, and I went back and forth on that
one. It looked like it was a list ("Adding... Adding... Adding...") so
I'd left it, but my initial instinct matches yours: it should just get
reflowed like all the rest. I'll fix that too.

Thanks for the review!

-Kees

--
Kees Cook
Chrome OS & Brillo Security
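
For anyone who wants to sanity-check the margin the comment derives, here
is a minimal, stand-alone C sketch of that formula. It is not part of the
patch: the extra_bytes() helper name and the 25 MiB example size are
illustrative, and the 14K decompressor figure is the rough number Kees
cites above.

#include <stdio.h>

/*
 * Mirror of the header.S formula:
 *   extra_bytes = (uncompressed_size >> 12) + 32768 + 18 + decompressor_size
 */
static unsigned long extra_bytes(unsigned long uncompressed_size,
                                 unsigned long decompressor_size)
{
        /* Block-level growth: ~8 bytes per 32K (1 per 4K), rounding up
         * the true worst case of 5 bytes per 32767 for easier math. */
        unsigned long block_overhead = uncompressed_size >> 12;

        /* One worst-case uncompressed block (32767, rounded to 32768),
         * plus the 18 bytes of gzip header/trailer overhead, plus room
         * for the decompressor itself, which also lives past the data. */
        return block_overhead + 32768 + 18 + decompressor_size;
}

int main(void)
{
        /* Example: a 25 MiB uncompressed kernel (illustrative) and the
         * ~14K decompressor measured in the comment above. */
        unsigned long uncompressed = 25UL << 20;

        printf("extra bytes needed: %lu\n",
               extra_bytes(uncompressed, 14 * 1024));
        return 0;
}

For that 25 MiB example the terms work out to 6400 + 32768 + 18 + 14336 =
53522 bytes, i.e. a bit over 52K of slack past the uncompressed image.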