Message-ID: <6923d6a9-7728-fc71-f963-3617e5361732@suse.cz>
Date: Thu, 8 Dec 2022 09:13:41 +0100
Subject: Re: [PATCH net-next v3] skbuff: Introduce slab_build_skb()
To: Kees Cook, Jakub Kicinski
Cc: syzbot+fda18eaa8c12534ccb3b@syzkaller.appspotmail.com, Eric Dumazet,
 "David S. Miller", Paolo Abeni, Pavel Begunkov, pepsipu, kasan-dev,
 Andrii Nakryiko, ast@kernel.org, bpf, Daniel Borkmann, Hao Luo,
 Jesper Dangaard Brouer, John Fastabend, jolsa@kernel.org, KP Singh,
 martin.lau@linux.dev, Stanislav Fomichev, song@kernel.org, Yonghong Song,
 netdev@vger.kernel.org, LKML, Rasesh Mody, Ariel Elior, Manish Chopra,
 Menglong Dong, David Ahern, Richard Gobert, Andrey Konovalov,
 David Rientjes, GR-Linux-NIC-Dev@marvell.com,
 linux-hardening@vger.kernel.org, Feng Tang
References: <20221208060256.give.994-kees@kernel.org>
From: Vlastimil Babka
In-Reply-To: <20221208060256.give.994-kees@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On 12/8/22 07:02, Kees Cook wrote:
> syzkaller reported:
>
>   BUG: KASAN: slab-out-of-bounds in __build_skb_around+0x235/0x340 net/core/skbuff.c:294
>   Write of size 32 at addr ffff88802aa172c0 by task syz-executor413/5295
>
> For bpf_prog_test_run_skb(), which uses a kmalloc()ed buffer passed to
> build_skb().
>
> When build_skb() is passed a frag_size of 0, it means the buffer came
> from kmalloc. In these cases, ksize() is used to find its actual size,
> but since the allocation may not have been made to that size, actually
> perform the krealloc() call so that all the associated buffer size
> checking will be correctly notified (and use the "new" pointer so that
> compiler hinting works correctly). Split this logic out into a new
> interface, slab_build_skb(), but leave the original 0 checking for now
> to catch any stragglers.
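
To make the size mismatch in this description concrete, here is a minimal
userspace sketch. It is only a toy model under stated assumptions: the toy_*
names, the bucket table and the sizes used are hypothetical and are not kernel
APIs. It shows why a write that fits inside the ksize() bucket can still trip
a checker that only knows the requested size, and why resizing to the ksize()
value first makes the same write acceptable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model: none of these names are kernel APIs. */

/* Bucket sizes loosely modeled on kmalloc size classes (an assumption). */
static const size_t toy_buckets[] = { 64, 128, 256, 512, 1024, 2048 };

struct toy_obj {
	size_t requested;	/* what the caller asked for; checkers trust this */
	size_t bucket;		/* what the allocator really reserved */
};

/* Model of kmalloc(): round the request up to the next bucket. */
static int toy_alloc(struct toy_obj *o, size_t n)
{
	size_t i;

	assert(o != NULL);
	for (i = 0; i < sizeof(toy_buckets) / sizeof(toy_buckets[0]); i++) {
		if (n <= toy_buckets[i]) {
			o->requested = n;
			o->bucket = toy_buckets[i];
			return 0;
		}
	}
	return -1;	/* too large for this toy model */
}

/* Model of ksize(): report the bucket size without changing anything. */
static size_t toy_ksize(const struct toy_obj *o)
{
	return o->bucket;
}

/* Model of a bounds checker: only [0, requested) counts as valid. */
static int toy_write_ok(const struct toy_obj *o, size_t off, size_t len)
{
	return off <= o->requested && len <= o->requested - off;
}

/* Model of krealloc() within the same bucket: the object stays in
 * place and only the tracked size is raised.
 */
static int toy_krealloc(struct toy_obj *o, size_t n)
{
	if (n > o->bucket)
		return -1;	/* a real allocator would have to move the object */
	o->requested = n;
	return 0;
}
```

In this model, a 700-byte request lands in the 1024-byte bucket, so
toy_ksize() reports 1024. A 32-byte write at offset 992 is rejected by
toy_write_ok() until toy_krealloc(&o, toy_ksize(&o)) raises the tracked size,
after which the same write is accepted.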
>
> Reported-by: syzbot+fda18eaa8c12534ccb3b@syzkaller.appspotmail.com
> Link: https://groups.google.com/g/syzkaller-bugs/c/UnIKxTtU5-0/m/-wbXinkgAQAJ
> Fixes: 38931d8989b5 ("mm: Make ksize() a reporting-only function")
> Cc: Jakub Kicinski
> Cc: Eric Dumazet
> Cc: "David S. Miller"
> Cc: Paolo Abeni
> Cc: Pavel Begunkov
> Cc: pepsipu
> Cc: syzbot+fda18eaa8c12534ccb3b@syzkaller.appspotmail.com
> Cc: Vlastimil Babka
> Cc: kasan-dev
> Cc: Andrii Nakryiko
> Cc: ast@kernel.org
> Cc: bpf
> Cc: Daniel Borkmann
> Cc: Hao Luo
> Cc: Jesper Dangaard Brouer
> Cc: John Fastabend
> Cc: jolsa@kernel.org
> Cc: KP Singh
> Cc: martin.lau@linux.dev
> Cc: Stanislav Fomichev
> Cc: song@kernel.org
> Cc: Yonghong Song
> Cc: netdev@vger.kernel.org
> Cc: LKML
> Signed-off-by: Kees Cook
> ---
> v3:
> - make sure "resized" is passed back so compiler hints survive
> - update kerndoc (kuba)
> v2: https://lore.kernel.org/lkml/20221208000209.gonna.368-kees@kernel.org
> v1: https://lore.kernel.org/netdev/20221206231659.never.929-kees@kernel.org/
> ---
>  drivers/net/ethernet/broadcom/bnx2.c      |  2 +-
>  drivers/net/ethernet/qlogic/qed/qed_ll2.c |  2 +-
>  include/linux/skbuff.h                    |  1 +
>  net/bpf/test_run.c                        |  2 +-
>  net/core/skbuff.c                         | 70 ++++++++++++++++++++---
>  5 files changed, 66 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/net/ethernet/broadcom/bnx2.c b/drivers/net/ethernet/broadcom/bnx2.c
> index fec57f1982c8..b2230a4a2086 100644
> --- a/drivers/net/ethernet/broadcom/bnx2.c
> +++ b/drivers/net/ethernet/broadcom/bnx2.c
> @@ -3045,7 +3045,7 @@ bnx2_rx_skb(struct bnx2 *bp, struct bnx2_rx_ring_info *rxr, u8 *data,
>
>  	dma_unmap_single(&bp->pdev->dev, dma_addr, bp->rx_buf_use_size,
>  			 DMA_FROM_DEVICE);
> -	skb = build_skb(data, 0);
> +	skb = slab_build_skb(data);
>  	if (!skb) {
>  		kfree(data);
>  		goto error;
> diff --git a/drivers/net/ethernet/qlogic/qed/qed_ll2.c b/drivers/net/ethernet/qlogic/qed/qed_ll2.c
> index ed274f033626..e5116a86cfbc 100644
> --- a/drivers/net/ethernet/qlogic/qed/qed_ll2.c
> +++ b/drivers/net/ethernet/qlogic/qed/qed_ll2.c
> @@ -200,7 +200,7 @@ static void qed_ll2b_complete_rx_packet(void *cxt,
>  	dma_unmap_single(&cdev->pdev->dev, buffer->phys_addr,
>  			 cdev->ll2->rx_size, DMA_FROM_DEVICE);
>
> -	skb = build_skb(buffer->data, 0);
> +	skb = slab_build_skb(buffer->data);
>  	if (!skb) {
>  		DP_INFO(cdev, "Failed to build SKB\n");
>  		kfree(buffer->data);
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index 7be5bb4c94b6..0b391b635430 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -1253,6 +1253,7 @@ struct sk_buff *build_skb_around(struct sk_buff *skb,
>  void skb_attempt_defer_free(struct sk_buff *skb);
>
>  struct sk_buff *napi_build_skb(void *data, unsigned int frag_size);
> +struct sk_buff *slab_build_skb(void *data);
>
>  /**
>   * alloc_skb - allocate a network buffer
> diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
> index 13d578ce2a09..611b1f4082cf 100644
> --- a/net/bpf/test_run.c
> +++ b/net/bpf/test_run.c
> @@ -1130,7 +1130,7 @@ int bpf_prog_test_run_skb(struct bpf_prog *prog, const union bpf_attr *kattr,
>  	}
>  	sock_init_data(NULL, sk);
>
> -	skb = build_skb(data, 0);
> +	skb = slab_build_skb(data);
>  	if (!skb) {
>  		kfree(data);
>  		kfree(ctx);
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 1d9719e72f9d..ae5a6f7db37b 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -269,12 +269,10 @@ static struct sk_buff *napi_skb_cache_get(void)
>  	return skb;
>  }
>
> -/* Caller must provide SKB that is memset cleared */
> -static void __build_skb_around(struct sk_buff *skb, void *data,
> -			       unsigned int frag_size)
> +static inline void __finalize_skb_around(struct sk_buff *skb, void *data,
> +					 unsigned int size)
>  {
>  	struct skb_shared_info *shinfo;
> -	unsigned int size = frag_size ? : ksize(data);
>
>  	size -= SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
>
> @@ -296,15 +294,71 @@ static void __build_skb_around(struct sk_buff *skb, void *data,
>  	skb_set_kcov_handle(skb, kcov_common_handle());
>  }
>
> +static inline void *__slab_build_skb(struct sk_buff *skb, void *data,
> +				     unsigned int *size)
> +{
> +	void *resized;
> +
> +	/* Must find the allocation size (and grow it to match). */
> +	*size = ksize(data);
> +	/* krealloc() will immediately return "data" when
> +	 * "ksize(data)" is requested: it is the existing upper
> +	 * bounds. As a result, GFP_ATOMIC will be ignored. Note
> +	 * that this "new" pointer needs to be passed back to the
> +	 * caller for use so the __alloc_size hinting will be
> +	 * tracked correctly.
> +	 */
> +	resized = krealloc(data, *size, GFP_ATOMIC);

Hmm, I just realized, this trick will probably break the new kmalloc size
tracking from Feng Tang (CC'd)? We need to make krealloc() update the stored
size, right? And even worse if slab_debug redzoning is enabled and after
commit 946fa0dbf2d8 ("mm/slub: extend redzone check to extra allocated
kmalloc space than requested") where the lack of update will result in
redzone check failures.

> +	WARN_ON_ONCE(resized != data);
> +	return resized;
> +}
> +
> +/* build_skb() variant which can operate on slab buffers.
> + * Note that this should be used sparingly as slab buffers
> + * cannot be combined efficiently by GRO!
> + */
> +struct sk_buff *slab_build_skb(void *data)
> +{
> +	struct sk_buff *skb;
> +	unsigned int size;
> +
> +	skb = kmem_cache_alloc(skbuff_head_cache, GFP_ATOMIC);
> +	if (unlikely(!skb))
> +		return NULL;
> +
> +	memset(skb, 0, offsetof(struct sk_buff, tail));
> +	data = __slab_build_skb(skb, data, &size);
> +	__finalize_skb_around(skb, data, size);
> +
> +	return skb;
> +}
> +EXPORT_SYMBOL(slab_build_skb);
> +
> +/* Caller must provide SKB that is memset cleared */
> +static void __build_skb_around(struct sk_buff *skb, void *data,
> +			       unsigned int frag_size)
> +{
> +	unsigned int size = frag_size;
> +
> +	/* frag_size == 0 is considered deprecated now. Callers
> +	 * using slab buffer should use slab_build_skb() instead.
> +	 */
> +	if (WARN_ONCE(size == 0, "Use slab_build_skb() instead"))
> +		data = __slab_build_skb(skb, data, &size);
> +	__finalize_skb_around(skb, data, size);
> +}
> +
>  /**
>   * __build_skb - build a network buffer
>   * @data: data buffer provided by caller
> - * @frag_size: size of data, or 0 if head was kmalloced
> + * @frag_size: size of data (must not be 0)
>   *
>   * Allocate a new &sk_buff. Caller provides space holding head and
> - * skb_shared_info. @data must have been allocated by kmalloc() only if
> - * @frag_size is 0, otherwise data should come from the page allocator
> - * or vmalloc()
> + * skb_shared_info. @data must have been allocated from the page
> + * allocator or vmalloc(). (A @frag_size of 0 to indicate a kmalloc()
> + * allocation is deprecated, and callers should use slab_build_skb()
> + * instead.)
>   * The return is the new skb buffer.
>   * On a failure the return is %NULL, and @data is not freed.
>   * Notes :