Received: by 2002:a05:6a10:f347:0:0:0:0 with SMTP id d7csp616009pxu; Thu, 3 Dec 2020 08:30:28 -0800 (PST) X-Google-Smtp-Source: ABdhPJwOCqli+l1hFfJI7GipCsdBdR18m/uSTuvagvCkxEthqc64m6rPfyocZb9h7zrhvpkxX4uq X-Received: by 2002:a50:ce48:: with SMTP id k8mr3554152edj.298.1607013028132; Thu, 03 Dec 2020 08:30:28 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1607013028; cv=none; d=google.com; s=arc-20160816; b=oaxzLz2R2EY1ZkAY9gvW0ucR0jDBJRFCt9X4YK6oWOKdbUAO4F9t9+mi2RIeOE7iIz oZcKQS61arCLyPzxfwRgTXYFSQFzou5tIfwYM3nQ2dNfEBjvTMvRQCsUXl8cGUzGIb72 Q/lbssfKqwlu0z+MqjbqrMFFqbPxG66fP0BcMtywL89Wt0bjtOUH8fqXVnkTu2J0jRta 8pJBXKejvsmukWTV28d7iejuJlJBQZljJ2O/2Xn6xJaepS2KeaTXLo2NUk0WtLi7DWBq B8crDfZDYPsH3lsF9nCukhomfSJ5I3sqq9rOjrjAjTXvoCGJyOwPRFogWIIioVSP7Pcs X/DA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:subject:message-id:date:from:in-reply-to :references:mime-version:dkim-signature; bh=cOQxk743ljCBxb7kE7j4bu/7hb7YV+trjyBCFiVsJYM=; b=wHVouLwp3VfmsqW6Tu3aXtgIayFaFZYnAW3BZrC82gJC/5LiKoTtx937QKE3EbmjgB /T79Y42sTjNiwrI0w6obCEazOAF35mHupqy9SbDRZO6QIdRh9tLfbHFXZQKAHNS2iT/J gw9/QzXEwmkNJ05YCo2kpXiVvfYBqwt2ULBMjNwwtaVJN/DxNYk/dFUTJ+rk/WLE2Fdi W4lvIqPSRH9U2NdsJpzKRnVU6cnureZ4TT2+fcNn9bo1oYH8yqmquJmSLIcUKqMic1Yu j9iFssz1IHldJnFcq0ozmmo8qU9yhK4fTYrwn1OxACdl3RCVPXrSxXGMr1AUbYUr2t16 ccBA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=WwCX2nfA; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id u15si1378106edt.525.2020.12.03.08.30.02; Thu, 03 Dec 2020 08:30:28 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=WwCX2nfA; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731267AbgLCQ2G (ORCPT + 99 others); Thu, 3 Dec 2020 11:28:06 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58538 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1731007AbgLCQ2F (ORCPT ); Thu, 3 Dec 2020 11:28:05 -0500 Received: from mail-il1-x142.google.com (mail-il1-x142.google.com [IPv6:2607:f8b0:4864:20::142]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 59616C061A51 for ; Thu, 3 Dec 2020 08:27:19 -0800 (PST) Received: by mail-il1-x142.google.com with SMTP id k8so2461808ilr.4 for ; Thu, 03 Dec 2020 08:27:19 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=cOQxk743ljCBxb7kE7j4bu/7hb7YV+trjyBCFiVsJYM=; b=WwCX2nfAUKU80kW+GxDa45X29jAbFTPCHGimVTXC0aJJqoQaHy95iVuXZZ7gvpoCMb zPk9vF5LDoCraov0cJaDPKZBqAcQa70G4hmWAWzMtK3iAWA0zX4o4q8La40T15ajxA4u SqgjiGqYTpahb1uSXRLRiXnQ5AzA3VhlFfMdopRd0n0+tDskQawcixxFz1vczy7FbRRE bwIodnpe1HRrbT1r6DPWCg5+gDDxlwGsw2itknHvMpObi+uHZLgIXD42SBm8JTlrA+sE Wzpzw/a2NVhIAXustBuO6OZ729SDJz2Ss6e6BKZzNkY7e+6W4whToe0rhHCdEISd9LWy rX3Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=cOQxk743ljCBxb7kE7j4bu/7hb7YV+trjyBCFiVsJYM=; b=ZmrdthCuDchUtlvVQsZP6Ps5J9Biy2fuBq/nzbgYOAP2gQuSxgll/P2sv+gfmLM4Xq gMBKLSrlK14eWVWrGV9FcVsVFEHQPhfqB+rSfGZmSZIb6PftY275LK01OPUtes3vWYmT DQWdCaFdYF1Q8KOgEWfonMxSQjC4nUPVUEIwANw4FtfUuiE7O566rpiO52hJOg4A61ap e46Mg8zZzL7R97Z3pBXUvewKqrSUVFHuHjIvrFX+npIuYe2+/BH4F+vNigXb+EL+Ooq7 DA3H7ZG8H4OGIkHEM8Sc7ks2HEBziuczNC3hr6yyGqeajrM7fxKN46sYfagSJDaC534z bKuQ== X-Gm-Message-State: AOAM532+deDiGuNlwAmu8bCXFQ6DtW28s5QPZa1IsYkPm9TA4PGyO5Xn cuxKd2KUQNeUfJJwzqY0vUXCztzU6LhnH2plHW2Anw== X-Received: by 2002:a92:358e:: with SMTP id c14mr3509419ilf.69.1607012838185; Thu, 03 Dec 2020 08:27:18 -0800 (PST) MIME-Version: 1.0 References: <000000000000b4862805b54ef573@google.com> In-Reply-To: From: Eric Dumazet Date: Thu, 3 Dec 2020 17:27:06 +0100 Message-ID: Subject: Re: WARNING in sk_stream_kill_queues (5) To: Marco Elver Cc: netdev , Andrew Morton , David Miller , Dmitry Vyukov , Alexander Potapenko , Jann Horn , Jakub Kicinski , LKML , Stephen Rothwell , syzkaller-bugs , Willem de Bruijn , syzbot Content-Type: text/plain; charset="UTF-8" Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Dec 3, 2020 at 4:58 PM Marco Elver wrote: > > On Mon, Nov 30, 2020 at 12:40AM -0800, syzbot wrote: > > Hello, > > > > syzbot found the following issue on: > > > > HEAD commit: 6147c83f Add linux-next specific files for 20201126 > > git tree: linux-next > > console output: https://syzkaller.appspot.com/x/log.txt?x=117c9679500000 > > kernel config: https://syzkaller.appspot.com/x/.config?x=9b91566da897c24f > > dashboard link: https://syzkaller.appspot.com/bug?extid=7b99aafdcc2eedea6178 > > compiler: gcc (GCC) 10.1.0-syz 20200507 > > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=103bf743500000 > > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=167c60c9500000 > > > > The issue was bisected to: > > > > commit 145cd60fb481328faafba76842aa0fd242e2b163 > > Author: Alexander Potapenko > > Date: Tue Nov 24 05:38:44 2020 +0000 > > > > mm, kfence: insert KFENCE hooks for SLUB > > > > bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=13abe5b3500000 > > final oops: https://syzkaller.appspot.com/x/report.txt?x=106be5b3500000 > > console output: https://syzkaller.appspot.com/x/log.txt?x=17abe5b3500000 > > > > IMPORTANT: if you fix the issue, please add the following tag to the commit: > > Reported-by: syzbot+7b99aafdcc2eedea6178@syzkaller.appspotmail.com > > Fixes: 145cd60fb481 ("mm, kfence: insert KFENCE hooks for SLUB") > > > > ------------[ cut here ]------------ > > WARNING: CPU: 0 PID: 11307 at net/core/stream.c:207 sk_stream_kill_queues+0x3c3/0x530 net/core/stream.c:207 > [...] > > Call Trace: > > inet_csk_destroy_sock+0x1a5/0x490 net/ipv4/inet_connection_sock.c:885 > > __tcp_close+0xd3e/0x1170 net/ipv4/tcp.c:2585 > > tcp_close+0x29/0xc0 net/ipv4/tcp.c:2597 > > inet_release+0x12e/0x280 net/ipv4/af_inet.c:431 > > __sock_release+0xcd/0x280 net/socket.c:596 > > sock_close+0x18/0x20 net/socket.c:1255 > > __fput+0x283/0x920 fs/file_table.c:280 > > task_work_run+0xdd/0x190 kernel/task_work.c:140 > > exit_task_work include/linux/task_work.h:30 [inline] > > do_exit+0xb89/0x29e0 kernel/exit.c:823 > > do_group_exit+0x125/0x310 kernel/exit.c:920 > > get_signal+0x3ec/0x2010 kernel/signal.c:2770 > > arch_do_signal_or_restart+0x2a8/0x1eb0 arch/x86/kernel/signal.c:811 > > handle_signal_work kernel/entry/common.c:144 [inline] > > exit_to_user_mode_loop kernel/entry/common.c:168 [inline] > > exit_to_user_mode_prepare+0x124/0x200 kernel/entry/common.c:198 > > syscall_exit_to_user_mode+0x36/0x260 kernel/entry/common.c:275 > > entry_SYSCALL_64_after_hwframe+0x44/0xa9 > > I've been debugging this and I think enabling KFENCE uncovered that some > code is assuming that the following is always true: > > ksize(kmalloc(S)) == ksize(kmalloc(S)) > I do not think we make this assumption. Each skb tracks the 'truesize' which is populated from __alloc_skb() using ksize(allocated head) . So if ksize() decides to give us random data, it should be still fine, because we use ksize(buff) only once at alloc skb time, and record the value in skb->truesize (only the socket buffer accounting would be off) > but I don't think this assumption can be made (with or without KFENCE). > > With KFENCE, we actually end up testing no code assumes this, because > KFENCE's ksize() always returns the exact size S. > > I have narrowed it down to sk_wmem_queued becoming <0 in > sk_wmem_free_skb(). > > The skb passed to sk_wmem_free_skb() and whose truesize causes > sk_wmem_queued to become negative is always allocated in: > > | kmem_cache_alloc_node+0x140/0x400 mm/slub.c:2939 > | __alloc_skb+0x6d/0x710 net/core/skbuff.c:198 > | alloc_skb_fclone include/linux/skbuff.h:1144 [inline] > | sk_stream_alloc_skb+0x109/0xc30 net/ipv4/tcp.c:888 > | tso_fragment net/ipv4/tcp_output.c:2124 [inline] > | tcp_write_xmit+0x1dbf/0x5ce0 net/ipv4/tcp_output.c:2674 > | __tcp_push_pending_frames+0xaa/0x390 net/ipv4/tcp_output.c:2866 > | tcp_push_pending_frames include/net/tcp.h:1864 [inline] > | tcp_data_snd_check net/ipv4/tcp_input.c:5374 [inline] > | tcp_rcv_established+0x8c9/0x1eb0 net/ipv4/tcp_input.c:5869 > | tcp_v4_do_rcv+0x5d1/0x870 net/ipv4/tcp_ipv4.c:1668 > | sk_backlog_rcv include/net/sock.h:1011 [inline] > | __release_sock+0x134/0x3a0 net/core/sock.c:2523 > | release_sock+0x54/0x1b0 net/core/sock.c:3053 > | sk_wait_data+0x177/0x450 net/core/sock.c:2565 > | tcp_recvmsg+0x17ea/0x2aa0 net/ipv4/tcp.c:2181 > | inet_recvmsg+0x11b/0x5d0 net/ipv4/af_inet.c:848 > | sock_recvmsg_nosec net/socket.c:885 [inline] > | sock_recvmsg net/socket.c:903 [inline] > | sock_recvmsg net/socket.c:899 [inline] > | ____sys_recvmsg+0x2c4/0x600 net/socket.c:2563 > | ___sys_recvmsg+0x127/0x200 net/socket.c:2605 > | __sys_recvmsg+0xe2/0x1a0 net/socket.c:2641 > | do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 > | entry_SYSCALL_64_after_hwframe+0x44/0xa9 > > I used the code below to add some warnings that helped narrow it down. > > Does any of this help explain the problem? Not yet :) tso_fragment() transfers some payload from one skb to another, and properly shifts this amount from src skb to dst skb (@buff) : sk_wmem_queued_add(sk, buff->truesize); buff->truesize += nlen; skb->truesize -= nlen; > > Thanks, > -- Marco > > ------ >8 ------ > > > diff --git a/include/net/sock.h b/include/net/sock.h > index e8d958ef3ea0..ef4837f3aba4 100644 > --- a/include/net/sock.h > +++ b/include/net/sock.h > @@ -35,6 +35,7 @@ > #ifndef _SOCK_H > #define _SOCK_H > > +#include > #include > #include > #include > @@ -1534,7 +1535,15 @@ static inline void sk_mem_uncharge(struct sock *sk, int size) > DECLARE_STATIC_KEY_FALSE(tcp_tx_skb_cache_key); > static inline void sk_wmem_free_skb(struct sock *sk, struct sk_buff *skb) > { > + bool bad = false; > + > sk_wmem_queued_add(sk, -skb->truesize); > + > + if (WARN_ON(READ_ONCE(sk->sk_wmem_queued) == -384)) { > + pr_info("wmem_queued=%d truesize=%u\n", sk->sk_wmem_queued, skb->truesize); > + bad = true; > + } > + > sk_mem_uncharge(sk, skb->truesize); > if (static_branch_unlikely(&tcp_tx_skb_cache_key) && > !sk->sk_tx_skb_cache && !skb_cloned(skb)) { > @@ -1544,6 +1553,9 @@ static inline void sk_wmem_free_skb(struct sock *sk, struct sk_buff *skb) > return; > } > __kfree_skb(skb); > + > + if (bad) > + (void)READ_ONCE(skb->truesize); /* UAF to let KASAN show where it was allocated */ > } > > static inline void sock_release_ownership(struct sock *sk) > diff --git a/net/core/skbuff.c b/net/core/skbuff.c > index ffe3dcc0ebea..f365495819ee 100644 > --- a/net/core/skbuff.c > +++ b/net/core/skbuff.c > @@ -208,6 +208,19 @@ struct sk_buff *__alloc_skb(unsigned int size, gfp_t gfp_mask, > size = SKB_DATA_ALIGN(size); > size += SKB_DATA_ALIGN(sizeof(struct skb_shared_info)); > data = kmalloc_reserve(size, gfp_mask, node, &pfmemalloc); > + > + if (is_kfence_address(data)) > + pr_info("kfence's ksize: %zu\n", ksize(data)); > + /* > + * BUG BUG > + * Hypothesis: The problem is that some code assumes that: > + * > + * ksize(kmalloc(S)) == ksize(kmalloc(S)) > + * > + * Note: If we force no KFENCE allocation for @data above, the warnings > + * disappear. KFENCE's ksize() always returns the exact size S. > + */ > + > if (!data) > goto nodata; > /* kmalloc(size) might give us more room than requested.