References: <0000000000005fd19505a4355311@google.com> <29bd64f4-5fe0-605e-59cc-1afa199b1141@gmail.com>
In-Reply-To: <29bd64f4-5fe0-605e-59cc-1afa199b1141@gmail.com>
From: "Jason A. Donenfeld"
Date: Sun, 26 Apr 2020 14:46:05 -0600
Subject: Re: INFO: rcu detected stall in wg_packet_tx_worker
To: Eric Dumazet
Cc: syzbot, David Miller, Florian Fainelli, Greg Kroah-Hartman,
    jhs@mojatatu.com, Jiří Pírko, Krzysztof Kozlowski, kuba@kernel.org,
    kvalo@codeaurora.org, leon@kernel.org, LKML,
    linux-kselftest@vger.kernel.org, Netdev, Shuah Khan,
    syzkaller-bugs@googlegroups.com, Thomas Gleixner,
    vivien.didelot@gmail.com, Cong Wang
X-Mailing-List: linux-kernel@vger.kernel.org

On Sun, Apr 26, 2020 at 2:38 PM Eric Dumazet wrote:
>
>
>
> On 4/26/20 1:26 PM, Eric Dumazet wrote:
> >
> >
> > On 4/26/20 12:42 PM, Jason A. Donenfeld wrote:
> >> On Sun, Apr 26, 2020 at 1:40 PM Eric Dumazet wrote:
> >>>
> >>>
> >>>
> >>> On 4/26/20 10:57 AM, syzbot wrote:
> >>>> syzbot has bisected this bug to:
> >>>>
> >>>> commit e7096c131e5161fa3b8e52a650d7719d2857adfd
> >>>> Author: Jason A. Donenfeld
> >>>> Date: Sun Dec 8 23:27:34 2019 +0000
> >>>>
> >>>>     net: WireGuard secure network tunnel
> >>>>
> >>>> bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=15258fcfe00000
> >>>> start commit: b2768df2 Merge branch 'for-linus' of git://git.kernel.org/..
> >>>> git tree: upstream
> >>>> final crash: https://syzkaller.appspot.com/x/report.txt?x=17258fcfe00000
> >>>> console output: https://syzkaller.appspot.com/x/log.txt?x=13258fcfe00000
> >>>> kernel config: https://syzkaller.appspot.com/x/.config?x=b7a70e992f2f9b68
> >>>> dashboard link: https://syzkaller.appspot.com/bug?extid=0251e883fe39e7a0cb0a
> >>>> userspace arch: i386
> >>>> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=15f5f47fe00000
> >>>> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=11e8efb4100000
> >>>>
> >>>> Reported-by: syzbot+0251e883fe39e7a0cb0a@syzkaller.appspotmail.com
> >>>> Fixes: e7096c131e51 ("net: WireGuard secure network tunnel")
> >>>>
> >>>> For information about bisection process see: https://goo.gl/tpsmEJ#bisection
> >>>>
> >>>
> >>> I have not looked at the repro closely, but WireGuard has some workers
> >>> that might loop forever, cond_resched() might help a bit.
> >>
> >> I'm working on this right now. Having a bit difficult of a time
> >> getting it to reproduce locally...
> >>
> >> The reports show the stall happening always at:
> >>
> >> static struct sk_buff *
> >> sfq_dequeue(struct Qdisc *sch)
> >> {
> >>         struct sfq_sched_data *q = qdisc_priv(sch);
> >>         struct sk_buff *skb;
> >>         sfq_index a, next_a;
> >>         struct sfq_slot *slot;
> >>
> >>         /* No active slots */
> >>         if (q->tail == NULL)
> >>                 return NULL;
> >>
> >> next_slot:
> >>         a = q->tail->next;
> >>         slot = &q->slots[a];
> >>
> >> Which is kind of interesting, because it's not like that should block
> >> or anything, unless there's some kasan faulting happening.
> >>
> >
> > I am not really sure WireGuard is involved, the repro does not rely on it anyway.
> >
>
> Yes, do not spend too much time on this.
>
> syzbot found its way into crazy qdisc settings these last days.
>
> ( I sent a patch yesterday for choke qdisc, it seems similar checks are needed in sfq )

Ah, whew, okay. I had just begun instrumenting sfq (the highly technical
term for "adding printks everywhere") to figure out what's going on.
Looks like you've got a handle on it, so I'll let you have at it.

On the brighter side, it seems like Dmitry's and my effort to get full
coverage of WireGuard has paid off in the sense that tons of packets
wind up being shoveled through it in one way or another, which is good.
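
For anyone puzzling over how a stall can be reported at the next_slot: label when nothing there blocks, here is a toy user-space model of a round-robin dequeue with a retry label. It is only a sketch: toy_slot, toy_sched, toy_init, toy_dequeue and the max_spins guard are all hypothetical names made up for illustration, not the kernel's sfq code, and whether the syzbot reproducer hits this particular condition is an assumption (the thread only says sfq probably needs checks similar to the choke patch). The point is that if the budget handed back to a slot is zero, the retry loop never finds a slot it may serve, so it spins instead of blocking.

```c
#include <assert.h>

#define TOY_SLOTS 4

/* Hypothetical toy types; NOT the kernel's struct sfq_slot / sfq_sched_data. */
struct toy_slot {
    int backlog;  /* packets still queued in this slot */
    int allot;    /* remaining budget for the current round */
    int next;     /* index of the next active slot in the ring */
};

struct toy_sched {
    struct toy_slot slots[TOY_SLOTS];
    int tail;     /* index of the tail slot, or -1 when no slot is active */
    int quantum;  /* budget added each time a slot is moved to the tail */
};

/* Put every slot in one ring, each holding `backlog` packets. */
void toy_init(struct toy_sched *q, int quantum, int backlog)
{
    assert(backlog > 0);
    for (int i = 0; i < TOY_SLOTS; i++) {
        q->slots[i].backlog = backlog;
        q->slots[i].allot = quantum;
        q->slots[i].next = (i + 1) % TOY_SLOTS;
    }
    q->tail = TOY_SLOTS - 1;
    q->quantum = quantum;
}

/*
 * Serve one packet round-robin. Returns the index of the slot served,
 * -1 if no slot is active, or -2 if more than max_spins retries passed
 * without finding a slot with budget (the "stall" case).
 */
int toy_dequeue(struct toy_sched *q, long max_spins)
{
    long spins = 0;
    struct toy_slot *slot;
    int a;

    if (q->tail < 0)
        return -1;

next_slot:
    a = q->slots[q->tail].next;
    slot = &q->slots[a];
    if (slot->allot <= 0) {
        /* Out of budget: rotate to this slot and top it up. With a
         * quantum of 0 the top-up is a no-op, so without the guard
         * below this loop would never exit. */
        if (++spins > max_spins)
            return -2;
        q->tail = a;
        slot->allot += q->quantum;
        goto next_slot;
    }

    slot->backlog--;
    slot->allot--;
    if (slot->backlog == 0) {
        /* Slot drained: unlink it from the ring. */
        if (a == q->tail)
            q->tail = -1;
        else
            q->slots[q->tail].next = slot->next;
    } else if (slot->allot <= 0) {
        /* Budget used up: this slot becomes the new tail. */
        q->tail = a;
        slot->allot += q->quantum;
    }
    return a;
}
```

With a positive quantum the model drains normally; with toy_init(&q, 0, 2) the first toy_dequeue(&q, 1000) call gives up and returns -2 rather than spinning forever. A guard counter of this kind is the user-space cousin of "adding printks everywhere".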