From: Dmitry Vyukov
Date: Mon, 24 Sep 2018 11:39:08 +0200
Subject: Re: KMSAN: uninit-value in memcmp (2)
To: Alexander Potapenko
Cc: Vladis Dronov, syzbot, syzkaller-bugs, David Miller, Eric Dumazet, LKML, Networking, sunlianwen
References: <000000000000565ab805768bf006@google.com> <125732064.15444205.1537718529926.JavaMail.zimbra@redhat.com> <1040580049.15456466.1537740558279.JavaMail.zimbra@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Sep 24, 2018 at 8:53 AM, Alexander Potapenko wrote:
> On Mon, Sep 24, 2018 at 12:09 AM Vladis Dronov wrote:
>>
>> Hello, Dmirty,
>>
>> Thank you for the reply. Can we please, discuss this further?
> Hi Vladis,
>> > You can see on dashboard that the last crash
>> > for the second version (2) happened just few days ago. So this is a
>> > different bug.
> FWIW I've just double-checked that the reproducer provided by
> syzkaller in the original message still triggers the report from the
> original message in the latest KMSAN tree (which already contains the
> __hw_addr_add_ex() fix from April).
>> Well... yes and no.
>> When I was looking at this bug (bug?id=088efeac32fd) I was looking
>> at the report at "2018/05/09 18:55" (https://syzkaller.appspot.com/text?tag=CrashReport&x=141b707b800000),
>> since it was the only report with a reproducer. This was my error.
>>
>> The error and the call trace in this report are:
>>
>> >>>
>> BUG: KMSAN: uninit-value in memcmp+0x119/0x180 lib/string.c:861
>> CPU: 0 PID: 38 Comm: kworker/0:1 Not tainted 4.17.0-rc3+ #88
>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
>> Workqueue: ipv6_addrconf addrconf_dad_work
>> Call Trace:
>>  __dump_stack lib/dump_stack.c:77 [inline]
>>  dump_stack+0x185/0x1d0 lib/dump_stack.c:113
>>  kmsan_report+0x142/0x240 mm/kmsan/kmsan.c:1067
>>  __msan_warning_32+0x6c/0xb0 mm/kmsan/kmsan_instr.c:683
>>  memcmp+0x119/0x180 lib/string.c:861
>>  __hw_addr_add_ex net/core/dev_addr_lists.c:61 [inline]
>>  __dev_mc_add+0x1fc/0x900 net/core/dev_addr_lists.c:670
>>  dev_mc_add+0x6d/0x80 net/core/dev_addr_lists.c:687
>>  igmp6_group_added+0x2db/0xa00 net/ipv6/mcast.c:662
>>  ipv6_dev_mc_inc+0xe9e/0x1130 net/ipv6/mcast.c:914
>>  addrconf_join_solict net/ipv6/addrconf.c:2103 [inline]
>>  addrconf_dad_begin net/ipv6/addrconf.c:3853 [inline]
>>  addrconf_dad_work+0x462/0x2a20 net/ipv6/addrconf.c:3979
>>  process_one_work+0x12c6/0x1f60 kernel/workqueue.c:2145
>>  worker_thread+0x113c/0x24f0 kernel/workqueue.c:2279
>>  kthread+0x539/0x720 kernel/kthread.c:239
>>  ret_from_fork+0x35/0x40 arch/x86/entry/entry_64.S:412
>>
>> Local variable description: ----buf@igmp6_group_added
>> Variable was created at:
>>  igmp6_group_added+0x4a/0xa00 net/ipv6/mcast.c:650
>>  ipv6_dev_mc_inc+0xe9e/0x1130 net/ipv6/mcast.c:914
>> <<<
>>
>> It is the same like in bug?id=3887c0d99aecb27d085180c5222d245d08a30806
>> which, after some more test, made me believe these bugs are duplicate
>> and are fixed by the same commit.
>>
>> But let's look at another report at "2018/09/12 21:00"
>> (https://syzkaller.appspot.com/text?tag=CrashReport&x=14f99b71400000)
>> at the bug (bug?id=088efeac32fd), the one you've mentioned as
>> "the last crash for the second version (2) happened just few days ago".
>>
>> Its error and the call trace are completely different:
>>
>> >>>
>> BUG: KMSAN: uninit-value in memcmp+0x11d/0x180 lib/string.c:863
>> CPU: 0 PID: 6107 Comm: syz-executor4 Not tainted 4.19.0-rc3+ #45
>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
>> Call Trace:
>>  __dump_stack lib/dump_stack.c:77 [inline]
>>  dump_stack+0x14b/0x190 lib/dump_stack.c:113
>>  kmsan_report+0x183/0x2b0 mm/kmsan/kmsan.c:956
>>  __msan_warning+0x70/0xc0 mm/kmsan/kmsan_instr.c:645
>>  memcmp+0x11d/0x180 lib/string.c:863
>>  dev_uc_add_excl+0x165/0x7b0 net/core/dev_addr_lists.c:464
>>  ndo_dflt_fdb_add net/core/rtnetlink.c:3463 [inline]
>>  rtnl_fdb_add+0x1081/0x1270 net/core/rtnetlink.c:3558
>>  rtnetlink_rcv_msg+0xa0b/0x1530 net/core/rtnetlink.c:4715
>>  netlink_rcv_skb+0x36e/0x5f0 net/netlink/af_netlink.c:2454
>>  rtnetlink_rcv+0x50/0x60 net/core/rtnetlink.c:4733
>>  netlink_unicast_kernel net/netlink/af_netlink.c:1317 [inline]
>>  netlink_unicast+0x1638/0x1720 net/netlink/af_netlink.c:1343
>>  netlink_sendmsg+0x1205/0x1290 net/netlink/af_netlink.c:1908
>>  sock_sendmsg_nosec net/socket.c:621 [inline]
>>  sock_sendmsg net/socket.c:631 [inline]
>> ...
>> Uninit was created at:
>> ...
>>  slab_post_alloc_hook mm/slab.h:446 [inline]
>>  slab_alloc_node mm/slub.c:2718 [inline]
>>  __kmalloc_node_track_caller+0x9e7/0x1160 mm/slub.c:4351
>>  __kmalloc_reserve net/core/skbuff.c:138 [inline]
>>  __alloc_skb+0x2f5/0x9e0 net/core/skbuff.c:206
>>  alloc_skb include/linux/skbuff.h:996 [inline]
>>  netlink_alloc_large_skb net/netlink/af_netlink.c:1189 [inline]
>>  netlink_sendmsg+0xb49/0x1290 net/netlink/af_netlink.c:1883
>>  sock_sendmsg_nosec net/socket.c:621 [inline]
>>  sock_sendmsg net/socket.c:631 [inline]
>>  ___sys_sendmsg+0xe70/0x1290 net/socket.c:2114
>> <<<
>>
>> This is a different bug. How come these 2 different reports for 2 different
>> bugs have ended in the same syzkaller report (bug?id=088efeac32fd) ?
>
> I suspect this is because syzbot used the top stack frame as the
> report signature.
> There's a mechanism to ignore frames like memcmp() in the reports, not
> sure why didn't it work in this case (maybe it just wasn't in place at
> the time the report happened).
>> One bug is fixed by the "net: fix uninit-value in __hw_addr_add_ex()" commit,
>> the second one is not, but they are still in the same syzkaller report.
>>
>> This was the reason of my confusion. I'm not sure how to fix this. If it is possible,
>> probably we need to cancel/revoke "#syz fix: net: fix uninit-value in __hw_addr_add_ex()"
>> for this syzkaller report (bug?id=088efeac32fd). And then "split" it into 2 or
>> more different reports, but I'm not sure if this is possible.
>>
>> Probably, syzkaller needs to look deeper into the KMSAN reports to differentiate
>> KMSAN errors happening because of different reasons.
>>
>> > On Sun, Sep 23, 2018 at 6:02 PM, Vladis Dronov wrote:
>> > > #syz fix: net: fix uninit-value in __hw_addr_add_ex()
>> >
>> > Hi Vladis,
>> >
>> > This can be fixed with "net: fix uninit-value in __hw_addr_add_ex()".
>> > That commit landed in April, syzbot waited till the commit reached all
>> > tested trees, and then closed the bug.
>> > But the similar bug continued to happen, so syzbot created second
>> > version of this bug (2). You can see on dashboard that the last crash
>> > for the second version (2) happened just few days ago. So this is a
>> > different bug.

Precisely discriminating bugs (root causes) based on crash text is a
generally undecidable problem, even for humans. We can even have
literally identical crash texts that are still different bugs, and we
can have significantly different-looking crash texts that are actually
caused by the same root cause.

syzbot extracts an "identity" string for each crash and then uses that
string to discriminate crashes and sort them into bins. This identity
string is what you see in the email subject and in the bug title on the
dashboard. This method can have both false positives and false
negatives, but it works reasonably well in most cases and looks like
the best practical option.

For this exact instance (memcmp) we actually improved the analysis
logic recently:
https://github.com/google/syzkaller/commit/0e29942f77715486995d996f80f82742812d75a2#diff-abe1515f011fad2659ff218f9eea9ae1
But this crash was analyzed and reported before the change. So if this
crash happens again, it should now be reported as "in __hw_addr_add_ex".

Re the __hw_addr_add_ex bug: as Alex noted, the crash was detected
_after_ the fixing commit went in. So it's something new and different
and can't be fixed by the older commit.

There is no single, general guideline as to what to do when several
different bugs are glued together into a single bug. Fixing at least
one of them (any) in the context of the bug is good; fixing both is
good too. When/if a bug is closed, new occurrences of similar crashes
(the same identity string) will lead to the creation of a new bug. So
if we fix only one and close the bug, eventually the second one will
lead to a new bug (it won't be lost), now dedicated to this second
crash.
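
To illustrate the binning described above, here is a rough sketch. It
is not the actual syzkaller code (which is written in Go); the frame
lists, the regular expression, and the names identity() and
bin_crashes() are all made up for this example. The trace excerpts are
taken from the two reports quoted earlier in this thread.

```python
import re
from collections import defaultdict

# Hypothetical frame lists for this sketch; not syzkaller's real configuration.
REPORTING_FRAMES = {"__dump_stack", "dump_stack", "kmsan_report",
                    "__msan_warning", "__msan_warning_32"}
GENERIC_FRAMES = {"memcmp"}  # helpers that say little about the root cause

# Matches "func+0x1/0x2 file.c:12" as well as "func file.c:12 [inline]".
FRAME_RE = re.compile(r"^\s*(\w+)(?:\+0x[0-9a-f]+/0x[0-9a-f]+)?\s+\S+:\d+")


def identity(kind, trace, skip_generic=True):
    """Build an identity string from the first frame that is not skipped."""
    skipped = REPORTING_FRAMES | (GENERIC_FRAMES if skip_generic else set())
    for line in trace.splitlines():
        m = FRAME_RE.match(line)
        if m and m.group(1) not in skipped:
            return f"{kind} in {m.group(1)}"
    return kind


def bin_crashes(crashes, skip_generic=True):
    """Group crash names by identity string."""
    bins = defaultdict(list)
    for name, kind, trace in crashes:
        bins[identity(kind, trace, skip_generic)].append(name)
    return dict(bins)


TRACE_MAY = """\
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x185/0x1d0 lib/dump_stack.c:113
 kmsan_report+0x142/0x240 mm/kmsan/kmsan.c:1067
 __msan_warning_32+0x6c/0xb0 mm/kmsan/kmsan_instr.c:683
 memcmp+0x119/0x180 lib/string.c:861
 __hw_addr_add_ex net/core/dev_addr_lists.c:61 [inline]
 __dev_mc_add+0x1fc/0x900 net/core/dev_addr_lists.c:670
"""

TRACE_SEP = """\
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x14b/0x190 lib/dump_stack.c:113
 kmsan_report+0x183/0x2b0 mm/kmsan/kmsan.c:956
 __msan_warning+0x70/0xc0 mm/kmsan/kmsan_instr.c:645
 memcmp+0x11d/0x180 lib/string.c:863
 dev_uc_add_excl+0x165/0x7b0 net/core/dev_addr_lists.c:464
"""

CRASHES = [
    ("2018/05/09 18:55", "KMSAN: uninit-value", TRACE_MAY),
    ("2018/09/12 21:00", "KMSAN: uninit-value", TRACE_SEP),
]

if __name__ == "__main__":
    # Top frame only: both crashes fall into the same "in memcmp" bin.
    print(bin_crashes(CRASHES, skip_generic=False))
    # Skipping memcmp: the two crashes end up in separate bins.
    print(bin_crashes(CRASHES, skip_generic=True))
```

With memcmp treated as an ordinary frame, both crashes share one
identity string, which is how two different bugs ended up under one
title. With memcmp skipped, they are split by the calling function.
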

Now syzbot thinks that this bug is fixed/closed:
https://syzkaller.appspot.com/bug?extid=d3402c47f680ff24b29c
There is deliberately no "undo" functionality, because it's inherently
racy with the creation of a new version of this bug by new crashes. So
if these crashes happen again, syzbot will open new bugs (now with
better-discriminated titles). We can wait for that. Or we can submit
new fixes without waiting for new syzbot bugs (adding Reported-by to
new commits referencing this bug should not do any harm).

Hope this clarifies things a bit.

Thanks