Received: by 2002:a05:6359:c8b:b0:c7:702f:21d4 with SMTP id go11csp17979rwb; Mon, 26 Sep 2022 08:45:06 -0700 (PDT) X-Google-Smtp-Source: AMsMyM7fgJsd/33mn7z/xgEg6e234Iu4gvwnPWhcHmwdgP5Lt0cpoLeS5z4GTB2a/aYiJRoRS+Ck X-Received: by 2002:a17:902:ab55:b0:176:8906:a3c5 with SMTP id ij21-20020a170902ab5500b001768906a3c5mr22944474plb.72.1664207105659; Mon, 26 Sep 2022 08:45:05 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1664207105; cv=none; d=google.com; s=arc-20160816; b=Lr9B+u+hiWmwdRaD0+UirG0ujB5oqFXn7XslLt0DLPsi9UsOtkuloDMZchvdY/lVXI YFKXCIG17pJfARseoO6X8v54DOO8bbK3vyC2yF+io5x3MbU8J1yD5/MkVgYdZ4zpeTgP no1Jjl0b7tAZ8QCLokTkwRqDv1Cwn/rb7SoEPmRpWW7DCBmtjJ7kX0WKmmVBdybECMZn Vv4zMYurqd5rr49vuTLwv6+PurUqvbVWTVZxiF7gYB3g0uXBzoOoMgwjOQc3pM9BZHOP BWinUVZfyDaMJPJj/faLdNlAe/55/Y4UbqIpkUdkvMR91B+n1GKe5pXnQgN0Uwl7QFHp 4SvA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:in-reply-to:from :content-language:references:cc:to:subject:user-agent:mime-version :date:message-id:dkim-signature:dkim-signature; bh=SUnm6fFN7Cbco4Ciib9Dp4ei7l6IMJGpVSBxC1Y2qbA=; b=EYi8lyZ6QuCs3cdkpSgF2o8lfJkV8Nf7DZkyg+BC/L0PK9FHstntBHDKo42ouz7e+Y 3UWHfw++yP6Hs25WdN0xx5MEkZQG/XnQhOn0RJ+84lRKxyK+To2VUyw5nn4Br/MIk8qS UylyQvqt+VJg6fjoMAVcJw/Z/+QvOpDCZ7AJLR/Q4KyrySBz5OrovnzXYbMqAZpoJqw8 8h90w6JNpDVMrOy+fS1j1I4kB+OeC4ueycfmfUSXGlU7s+2FzrO/+iuxO8pMHXi3bqML LNNnUaSSB129ygF0o/Wc/ETjiVaolHHgn+dqUinurswKEYjO2G9NmskaqRA1CQrtLY1x 6MOw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@suse.cz header.s=susede2_rsa header.b=GINR8720; dkim=neutral (no key) header.i=@suse.cz header.s=susede2_ed25519 header.b=KgFpOhA7; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id l7-20020a170902f68700b001789faa628fsi12642239plg.395.2022.09.26.08.44.19; Mon, 26 Sep 2022 08:45:05 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@suse.cz header.s=susede2_rsa header.b=GINR8720; dkim=neutral (no key) header.i=@suse.cz header.s=susede2_ed25519 header.b=KgFpOhA7; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234480AbiIZOtw (ORCPT + 99 others); Mon, 26 Sep 2022 10:49:52 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59428 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235417AbiIZOsx (ORCPT ); Mon, 26 Sep 2022 10:48:53 -0400 Received: from smtp-out2.suse.de (smtp-out2.suse.de [IPv6:2001:67c:2178:6::1d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1FAF771984; Mon, 26 Sep 2022 06:15:25 -0700 (PDT) Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id A1F231F9B0; Mon, 26 Sep 2022 13:15:23 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_rsa; t=1664198123; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=SUnm6fFN7Cbco4Ciib9Dp4ei7l6IMJGpVSBxC1Y2qbA=; b=GINR8720H7FRt/NwnAThMXKadZ/jHetEfHQabCeHjYvJC0Ay/N/yrY4L2IA45c9LoMhZCZ /uLFla1qp4N9dJxKYjP0KHnNnh00SxZWxr/BsVxiaGxK30LOcRDuqFnj7IDPt4SRjkUrrL EsXkiYJ7NpGujiE2BPJ/t1EnWDNjjQs= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_ed25519; t=1664198123; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=SUnm6fFN7Cbco4Ciib9Dp4ei7l6IMJGpVSBxC1Y2qbA=; b=KgFpOhA7zTvFQoSfllErNdLrjLXyakGagE2DCW1UZbzMn8KUX2NJflS9dN3YuGifvDT+Zb Y/3YCIzKAg7vORDA== Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id DD79213486; Mon, 26 Sep 2022 13:15:22 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id d5MSNeqlMWPkZgAAMHmgww (envelope-from ); Mon, 26 Sep 2022 13:15:22 +0000 Message-ID: Date: Mon, 26 Sep 2022 15:15:22 +0200 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.3.0 Subject: Re: [PATCH v2 02/16] slab: Introduce kmalloc_size_roundup() To: Kees Cook Cc: Christoph Lameter , Pekka Enberg , David Rientjes , Joonsoo Kim , Andrew Morton , linux-mm@kvack.org, "Ruhl, Michael J" , Hyeonggon Yoo <42.hyeyoo@gmail.com>, "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Greg Kroah-Hartman , Nick Desaulniers , Alex Elder , Josef Bacik , David Sterba , Sumit Semwal , =?UTF-8?Q?Christian_K=c3=b6nig?= , Jesse Brandeburg , Daniel Micay , Yonghong Song , Marco Elver , Miguel Ojeda , linux-kernel@vger.kernel.org, netdev@vger.kernel.org, linux-btrfs@vger.kernel.org, linux-media@vger.kernel.org, dri-devel@lists.freedesktop.org, linaro-mm-sig@lists.linaro.org, linux-fsdevel@vger.kernel.org, intel-wired-lan@lists.osuosl.org, dev@openvswitch.org, x86@kernel.org, llvm@lists.linux.dev, linux-hardening@vger.kernel.org References: <20220923202822.2667581-1-keescook@chromium.org> <20220923202822.2667581-3-keescook@chromium.org> Content-Language: en-US From: Vlastimil Babka In-Reply-To: <20220923202822.2667581-3-keescook@chromium.org> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-3.8 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,NICE_REPLY_A,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_SOFTFAIL autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 9/23/22 22:28, Kees Cook wrote: > In the effort to help the compiler reason about buffer sizes, the > __alloc_size attribute was added to allocators. This improves the scope > of the compiler's ability to apply CONFIG_UBSAN_BOUNDS and (in the near > future) CONFIG_FORTIFY_SOURCE. For most allocations, this works well, > as the vast majority of callers are not expecting to use more memory > than what they asked for. > > There is, however, one common exception to this: anticipatory resizing > of kmalloc allocations. These cases all use ksize() to determine the > actual bucket size of a given allocation (e.g. 128 when 126 was asked > for). This comes in two styles in the kernel: > > 1) An allocation has been determined to be too small, and needs to be > resized. Instead of the caller choosing its own next best size, it > wants to minimize the number of calls to krealloc(), so it just uses > ksize() plus some additional bytes, forcing the realloc into the next > bucket size, from which it can learn how large it is now. For example: > > data = krealloc(data, ksize(data) + 1, gfp); > data_len = ksize(data); > > 2) The minimum size of an allocation is calculated, but since it may > grow in the future, just use all the space available in the chosen > bucket immediately, to avoid needing to reallocate later. A good > example of this is skbuff's allocators: > > data = kmalloc_reserve(size, gfp_mask, node, &pfmemalloc); > ... > /* kmalloc(size) might give us more room than requested. > * Put skb_shared_info exactly at the end of allocated zone, > * to allow max possible filling before reallocation. > */ > osize = ksize(data); > size = SKB_WITH_OVERHEAD(osize); > > In both cases, the "how much was actually allocated?" question is answered > _after_ the allocation, where the compiler hinting is not in an easy place > to make the association any more. This mismatch between the compiler's > view of the buffer length and the code's intention about how much it is > going to actually use has already caused problems[1]. It is possible to > fix this by reordering the use of the "actual size" information. > > We can serve the needs of users of ksize() and still have accurate buffer > length hinting for the compiler by doing the bucket size calculation > _before_ the allocation. Code can instead ask "how large an allocation > would I get for a given size?". > > Introduce kmalloc_size_roundup(), to serve this function so we can start > replacing the "anticipatory resizing" uses of ksize(). > > [1] https://github.com/ClangBuiltLinux/linux/issues/1599 > https://github.com/KSPP/linux/issues/183 > > Cc: Vlastimil Babka > Cc: Christoph Lameter > Cc: Pekka Enberg > Cc: David Rientjes > Cc: Joonsoo Kim > Cc: Andrew Morton > Cc: linux-mm@kvack.org > Signed-off-by: Kees Cook OK, added patch 1+2 to slab.git for-next branch. Had to adjust this one a bit, see below. > --- > include/linux/slab.h | 31 +++++++++++++++++++++++++++++++ > mm/slab.c | 9 ++++++--- > mm/slab_common.c | 20 ++++++++++++++++++++ > 3 files changed, 57 insertions(+), 3 deletions(-) > > diff --git a/include/linux/slab.h b/include/linux/slab.h > index 41bd036e7551..727640173568 100644 > --- a/include/linux/slab.h > +++ b/include/linux/slab.h > @@ -188,7 +188,21 @@ void * __must_check krealloc(const void *objp, size_t new_size, gfp_t flags) __r > void kfree(const void *objp); > void kfree_sensitive(const void *objp); > size_t __ksize(const void *objp); > + > +/** > + * ksize - Report actual allocation size of associated object > + * > + * @objp: Pointer returned from a prior kmalloc()-family allocation. > + * > + * This should not be used for writing beyond the originally requested > + * allocation size. Either use krealloc() or round up the allocation size > + * with kmalloc_size_roundup() prior to allocation. If this is used to > + * access beyond the originally requested allocation size, UBSAN_BOUNDS > + * and/or FORTIFY_SOURCE may trip, since they only know about the > + * originally allocated size via the __alloc_size attribute. > + */ > size_t ksize(const void *objp); > + > #ifdef CONFIG_PRINTK > bool kmem_valid_obj(void *object); > void kmem_dump_obj(void *object); > @@ -779,6 +793,23 @@ extern void kvfree(const void *addr); > extern void kvfree_sensitive(const void *addr, size_t len); > > unsigned int kmem_cache_size(struct kmem_cache *s); > + > +/** > + * kmalloc_size_roundup - Report allocation bucket size for the given size > + * > + * @size: Number of bytes to round up from. > + * > + * This returns the number of bytes that would be available in a kmalloc() > + * allocation of @size bytes. For example, a 126 byte request would be > + * rounded up to the next sized kmalloc bucket, 128 bytes. (This is strictly > + * for the general-purpose kmalloc()-based allocations, and is not for the > + * pre-sized kmem_cache_alloc()-based allocations.) > + * > + * Use this to kmalloc() the full bucket size ahead of time instead of using > + * ksize() to query the size after an allocation. > + */ > +size_t kmalloc_size_roundup(size_t size); > + > void __init kmem_cache_init_late(void); > > #if defined(CONFIG_SMP) && defined(CONFIG_SLAB) > diff --git a/mm/slab.c b/mm/slab.c > index 10e96137b44f..2da862bf6226 100644 > --- a/mm/slab.c > +++ b/mm/slab.c > @@ -4192,11 +4192,14 @@ void __check_heap_object(const void *ptr, unsigned long n, > #endif /* CONFIG_HARDENED_USERCOPY */ > > /** > - * __ksize -- Uninstrumented ksize. > + * __ksize -- Report full size of underlying allocation > * @objp: pointer to the object > * > - * Unlike ksize(), __ksize() is uninstrumented, and does not provide the same > - * safety checks as ksize() with KASAN instrumentation enabled. > + * This should only be used internally to query the true size of allocations. > + * It is not meant to be a way to discover the usable size of an allocation > + * after the fact. Instead, use kmalloc_size_roundup(). Using memory beyond > + * the originally requested allocation size may trigger KASAN, UBSAN_BOUNDS, > + * and/or FORTIFY_SOURCE. > * > * Return: size of the actual memory used by @objp in bytes > */ > diff --git a/mm/slab_common.c b/mm/slab_common.c > index 457671ace7eb..d7420cf649f8 100644 > --- a/mm/slab_common.c > +++ b/mm/slab_common.c > @@ -721,6 +721,26 @@ struct kmem_cache *kmalloc_slab(size_t size, gfp_t flags) > return kmalloc_caches[kmalloc_type(flags)][index]; > } > > +size_t kmalloc_size_roundup(size_t size) > +{ > + struct kmem_cache *c; > + > + /* Short-circuit the 0 size case. */ > + if (unlikely(size == 0)) > + return 0; > + /* Short-circuit saturated "too-large" case. */ > + if (unlikely(size == SIZE_MAX)) > + return SIZE_MAX; > + /* Above the smaller buckets, size is a multiple of page size. */ > + if (size > KMALLOC_MAX_CACHE_SIZE) > + return PAGE_SIZE << get_order(size); > + > + /* The flags don't matter since size_index is common to all. */ > + c = kmalloc_slab(size, GFP_KERNEL); > + return c ? c->object_size : 0; > +} > +EXPORT_SYMBOL(kmalloc_size_roundup); We need a SLOB version too as it's not yet removed... I added this: diff --git a/mm/slob.c b/mm/slob.c index 2bd4f476c340..5dbdf6ad8bcc 100644 --- a/mm/slob.c +++ b/mm/slob.c @@ -574,6 +574,20 @@ void kfree(const void *block) } EXPORT_SYMBOL(kfree); +size_t kmalloc_size_roundup(size_t size) +{ + /* Short-circuit the 0 size case. */ + if (unlikely(size == 0)) + return 0; + /* Short-circuit saturated "too-large" case. */ + if (unlikely(size == SIZE_MAX)) + return SIZE_MAX; + + return ALIGN(size, ARCH_KMALLOC_MINALIGN); +} + +EXPORT_SYMBOL(kmalloc_size_roundup); + /* can't use ksize for kmem_cache_alloc memory, only kmalloc */ size_t __ksize(const void *block) {