From: Dmitry Vyukov
Date: Tue, 23 Apr 2019 17:41:40 +0300
Subject: Re: WARNING in percpu_ref_kill_and_confirm
To: Linus Torvalds
Cc: Jens Axboe, syzbot, Arnd Bergmann, Borislav Petkov,
    "Darrick J. Wong", Greg Kroah-Hartman, Peter Anvin, Linux API,
    linux-arch, linux-block, linux-fsdevel, Linux List Kernel Mailing,
    Andrew Lutomirski, Mathieu Desnoyers, Ingo Molnar, Michael Ellerman,
    syzkaller-bugs, Thomas Gleixner, Al Viro,
    "the arch/x86 maintainers", syzkaller
References: <00000000000043fe9c058720a5d3@google.com>
    <53a17444-9539-5810-82a0-ceeefa742508@kernel.dk>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Apr 22, 2019 at 7:48 PM Linus Torvalds wrote:
>
> On Mon, Apr 22, 2019 at 9:38 AM Jens Axboe wrote:
> >
> > With the mutex change in, I can trigger it in a second or so. Just ran
> > the reproducer with that change reverted, and I'm not seeing any badness.
> > So I do wonder if the bisect results are accurate?
>
> Looking at the syzbot report, it's syzbot being confused.
>
> The actual WARNING in percpu_ref_kill_and_confirm() only happens with
> recent kernels.
>
> But then syzbot mixes it up with a completely different bug:
>
>     crash: BUG: MAX_STACK_TRACE_ENTRIES too low!
>     BUG: MAX_STACK_TRACE_ENTRIES too low!
>
> and for some reason decides that *that* bug is the same thing entirely.
>
> So yeah, I think the simple percpu_ref_is_dying() check is sufficient,
> and that the syzbot bisection is completely bogus.

Using a crashed/not-crashed predicate gives better results overall.
More than half of kernel bugs have different manifestations, for
various reasons. And even if we can say for sure that we see a
different bug, we still don't know whether the original bug is also
there or not. See the following threads for details:

https://groups.google.com/d/msg/syzkaller-bugs/nFeC8-UG1gg/y6gUEsvAAgAJ
https://groups.google.com/d/msg/syzkaller/sR8aAXaWEF4/tTWYRgvmAwAJ

Unrelated crashes are the most common cause of incorrect bisection
results (66%). To enable better bisection we would need to integrate
some meaningful precommit testing into the kernel development process
(which would be tremendously useful for other reasons too).

E.g. this "BUG: MAX_STACK_TRACE_ENTRIES too low!" is this:

https://syzkaller.appspot.com/bug?id=dbd70f0407487a061d2d46fdc6bccc94b95ce3c0

and the reproducer is simply opening /dev/infiniband/rdma_cm or
/dev/vhci or something equally simple with LOCKDEP enabled. None of
this was done in a testing environment for several weeks. And then it
took another month to propagate the fix through all distributed kernel
trees. For all that time, simple programs crash, bisection can't be
done, and we are spending time here...
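
The trade-off between the two bisection predicates discussed above can be illustrated with a toy model. This is only a sketch under stated assumptions, not how syzbot implements bisection: the commit names, both crash histories, the title "hypothetical other manifestation", and the helpers bisect_first_bad, any_crash and same_title are all made up for illustration. Only the two real crash titles come from this thread.

```python
from typing import Callable, Optional, Sequence

# Hypothetical harness: test(commit) returns the crash title observed when
# running the reproducer on that commit, or None if nothing crashed.
TestFn = Callable[[str], Optional[str]]


def bisect_first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Binary search for the first bad commit.

    Assumes commits[0] is good and commits[-1] is bad. If is_bad is not
    monotonic in between, the answer can be wrong, which is the point here.
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[lo] good, commits[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi]


def any_crash(test: TestFn) -> Callable[[str], bool]:
    """Crashed/not-crashed predicate: any crash counts as bad."""
    return lambda commit: test(commit) is not None


def same_title(test: TestFn, title: str) -> Callable[[str], bool]:
    """Title-matching predicate: only the exact original title counts as bad."""
    return lambda commit: test(commit) == title


COMMITS = [f"c{i}" for i in range(8)]
WARN = "WARNING in percpu_ref_kill_and_confirm"
UNRELATED = "BUG: MAX_STACK_TRACE_ENTRIES too low!"
OTHER_FORM = "hypothetical other manifestation"  # made-up title

# Toy history 1: the WARNING is introduced at c6, while an unrelated crash
# is present on c2..c4.
unrelated_history = dict(zip(
    COMMITS, [None, None, UNRELATED, UNRELATED, UNRELATED, None, WARN, WARN]))

# Toy history 2: a single bug introduced at c3 that shows up under another
# title on c3..c5 before manifesting as the WARNING.
variant_history = dict(zip(
    COMMITS, [None, None, None, OTHER_FORM, OTHER_FORM, OTHER_FORM, WARN, WARN]))

print(bisect_first_bad(COMMITS, any_crash(unrelated_history.get)))         # c2
print(bisect_first_bad(COMMITS, same_title(unrelated_history.get, WARN)))  # c6
print(bisect_first_bad(COMMITS, any_crash(variant_history.get)))           # c3
print(bisect_first_bad(COMMITS, same_title(variant_history.get, WARN)))    # c6
```

In the first history the crashed/not-crashed predicate is misled by the unrelated crash (c2 instead of c6). In the second the title-matching predicate misses the real culprit (c6 instead of c3) because the same bug manifests differently. Title matching gets the first history right only because the toy data says so: on c2..c4 a real run could not tell whether the original bug was also present.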