From: Brendan Higgins
Date: Tue, 10 Mar 2020 13:46:24 -0700
Subject: Re: BUG: kernel NULL pointer dereference, address: 00 - ida_free+0x76/0x140
To: Heikki Krogerus
Cc: Sakari Ailus, Andy Shevchenko, hdegoede@redhat.com,
    "rafael.j.wysocki", Naresh Kamboju, open list,
    "open list:KERNEL SELFTEST FRAMEWORK", Steven Rostedt,
    Sergey Senozhatsky, Andy Shevchenko, Shuah Khan, Anders Roxell,
    lkft-triage@lists.linaro.org, Rasmus Villemoes
In-Reply-To: <20200310111837.GA1368052@kuha.fi.intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Mar 10, 2020 at 4:18 AM Heikki Krogerus wrote:
>
> On Mon, Mar 09, 2020 at 02:43:13PM -0700, Brendan Higgins wrote:
> > On Mon, Mar 9, 2020 at 1:35 PM Brendan Higgins wrote:
> > >
> > > On Fri, Mar 6, 2020 at 4:05 AM Heikki Krogerus wrote:
> > > >
> > > > On Fri, Mar 06, 2020 at 12:33:50AM +0200, Sakari Ailus wrote:
> > > > > Hi Brendan,
> > > > >
> > > > > On Thu, Mar 05, 2020 at 11:51:20AM -0800, Brendan Higgins wrote:
> > > > > > On Thu, Mar 5, 2020 at 11:40 AM Brendan Higgins wrote:
> > > > > > >
> > > > > > > On Thu, Mar 5, 2020 at 11:18 AM Andy Shevchenko wrote:
> > > > > > > >
> > > > > > > > +Cc: Sakari
> > > > > > > >
> > > > > > > > On Thu, Mar 5, 2020 at 6:00 PM Naresh Kamboju wrote:
> > > > > > > > >
> > > > > > > > > Regression reported on Linux next 5.6.0-rc4-next-20200305 on x86_64,
> > > > > > > > > i386, arm and arm64. The steps to reproduce is running kselftests lib
> > > > > > > > > printf.sh test case.
> > > > > > > > > Which is doing modprobe operations.
> > > > > > > > >
> > > > > > > > > BTW, there are few RCU warnings from the boot log.
> > > > > > > > > Please refer below link for more details.
> > > > > > > > >
> > > > > > > > > Steps reproduce by using kselftests,
> > > > > > > > >
> > > > > > > > > - lsmod || true
> > > > > > > > > - cd /opt/kselftests/default-in-kernel/lib/
> > > > > > > > > - export PATH=/opt/kselftests/default-in-kernel/kselftest:$PATH
> > > > > > > > > - ./printf.sh || true
> > > > > > > > > - ./bitmap.sh || true
> > > > > > > > > - ./prime_numbers.sh || true
> > > > > > > > > - ./strscpy.sh || true
> > > > > > > > >
> > > > > > > > > x86_64 kernel BUG dump.
> > > > > > > > > + ./printf.sh
> > > > > > >
> > > > > > > Oops, I am wondering if I broke this with my change "Revert "software
> > > > > > > node: Simplify software_node_release() function"":
> > > > > > >
> > > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=d1c19322388d6935b534b494a2c223dd089e30dd
> > > > > > >
> > > > > > > I am still investigating, will update later.
> > > > > >
> > > > > > Okay, yeah, I am pretty sure I caused the breakage. I got an email
> > > > > > from kernel test robot a couple days ago that I didn't see:
> > > > > >
> > > > > > https://lists.01.org/hyperkitty/list/lkp@lists.01.org/thread/N3ZN5XH7HK24JVEJ5WSQD2SK6YCDRILR/
> > > > > >
> > > > > > It shows the same breakage after applying this change.
> > > > > >
> > > > > > I am still investigating how my change broke it, nevertheless.
> > > > >
> > > > > As nodes in the tree are being removed, the code before the patch that
> > > > > "simplified" the software_node_release() function accessed the node's parent
> > > > > in its release function.
> > > > >
> > > > > And if CONFIG_DEBUG_KOBJECT_RELEASE is defined, the release functions are no
> > > > > longer necessarily called in order, leading to referencing released memory.
> > > > > Oops!
> > > > >
> > > > > So Heikki's patch actually fixed a bug. :-)
> > > >
> > > > Well, I think it just hid the problem. It looks like the core
> > > > (lib/kobject.c) allows the parent kobject to be released before the
> > > > last child kobject is released. To be honest, that does not sound
> > > > right to me...
> > > >
> > > > I think we can workaround this problem by taking reference to the
> > > > parent when the child is added, and then releasing it when the child
> > > > is released, and in that way be guaranteed that the parent will not
> > > > disappear before the child is fully released, but that still does not
> > > > feel right. It feels more like the core is not doing it's job to me.
> > > > The parent just should not be released before its children.
> > > >
> > > > Either I'm wrong about that, and we still should take the reference on
> > > > the parent, or we revert my patch like Brendan proposed and then fix
> > >
> > > Either way, isn't it wrong to release the node ID before deleting the
> > > sysfs entry? I am not sure that my fix was the correct one, but I
> > > believe the bug that Heidi and I found is actually a bug.
>
> I agree.
>
> > > > the core with something like this (warning, I did not even try to
> > > > compile that):
> > >
> > > I will try it out.
> > >
> > > > diff --git a/lib/kobject.c b/lib/kobject.c
> > > > index 83198cb37d8d..ec5774992337 100644
> > > > --- a/lib/kobject.c
> > > > +++ b/lib/kobject.c
> > > > @@ -680,6 +680,12 @@ static void kobject_cleanup(struct kobject *kobj)
> > > >                 kobject_uevent(kobj, KOBJ_REMOVE);
> > > >         }
> > > >
> > > > +       if (t && t->release) {
> > > > +               pr_debug("kobject: '%s' (%p): calling ktype release\n",
> > > > +                        kobject_name(kobj), kobj);
> > > > +               t->release(kobj);
> > > > +       }
> > > > +
> > > >         /* remove from sysfs if the caller did not do it */
> > > >         if (kobj->state_in_sysfs) {
> > > >                 pr_debug("kobject: '%s' (%p): auto cleanup kobject_del\n",
> > > > @@ -687,12 +693,6 @@ static void kobject_cleanup(struct kobject *kobj)
> > > >                 kobject_del(kobj);
> > > >         }
> > > >
> > > > -       if (t && t->release) {
> > > > -               pr_debug("kobject: '%s' (%p): calling ktype release\n",
> > > > -                        kobject_name(kobj), kobj);
> > > > -               t->release(kobj);
> > > > -       }
> > > > -
> > > >         /* free name if we allocated it */
> > > >         if (name) {
> > > >                 pr_debug("kobject: '%s': free name\n", name);
> >
> > Alright, so I tried it and it looks like Heikki's suggestion worked.
> >
> > Is everyone comfortable going this route?
>
> Hold on. Another way to fix the problem is to increment the parent's
> reference count before that kobject_del(kobj) is called, and then
> decrementing it after t->release(kobj) is called. It may be safer to
> fix the problem like that.

Right, this was your first suggestion above, right? That actually made
more sense to me, but you seemed skeptical of it due to it being
messier, which makes sense.

Nevertheless, having children take a reference seems like the right
thing to do because the children need to deregister themselves from
the parent.

Calling t->release() ahead of kobject_del() seems to reintroduce the
problem that I pointed out, albeit *much* more briefly. If I understand
correctly, it is always wrong to have a sysfs entry that points to a
partially deallocated kobject.
Please correct me if I am wrong.

So I think there are two solutions: Either we have to ensure that each
child is deallocated first so we can preserve the kobject_del() and
then t->release() ordering, or we have to add some sort of "locking"
mechanism to prevent the kobject from being accessed by anything other
than the deallocation code until it is fully deallocated; well, it
would have to prevent any access at all :-). I think it goes without
saying that this "locking" idea is pretty flawed.

The problem with just having children take a reference is that the
kobject children already take a reference to their parent, so it seems
like the kobject should be smart enough to deallocate children rather
than having swnode have to keep a separate tally of children, no?

Sorry if this all seems obvious, I am not an expert on this part of
the kernel.

> My example above proofs that there is the problem, but it changes the
> order of execution which I think can always have other consequences.
>
> > Also, should I send this fix as a separate patch? Or do people want me
> > to send an updated revision of my revert patch with the fix?
>
> This needs to be send in its own separate patch. Ideally it could be
> send together with the revert in the same series, but I'm not sure
> that's possible anymore. Didn't Greg pick the revert already?

Sounds good. I did already let Greg know when he emailed us on
backporting the patch to stable, and he acked saying he removed them.
So as long as these are not in the queue for 5.6 (it is not in Linus'
tree yet), we should be good.

Cheers
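
For illustration only, here is a minimal userspace C sketch of the
"child takes a reference on its parent" idea discussed in this thread.
It is not kernel code: struct node, node_create(), node_put() and
demo_release_order() are hypothetical names made up for this example,
and a plain int counter stands in for the kernel's reference counting.
It only aims to show that a child's release step can still touch its
parent, because the parent cannot be freed while the child holds a
reference on it.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted tree node; NOT struct kobject. */
struct node {
	int refcount;
	int id;
	int nchildren;        /* parent state that a child's release touches */
	struct node *parent;
	int *log;             /* records the order in which nodes are released */
	int *count;           /* number of entries written to log */
};

/* Create a node holding one reference; a child also pins its parent. */
static struct node *node_create(int id, struct node *parent, int *log, int *count)
{
	struct node *n = calloc(1, sizeof(*n));

	if (!n)
		return NULL;
	n->refcount = 1;
	n->id = id;
	n->parent = parent;
	n->log = log;
	n->count = count;
	if (parent) {
		parent->refcount++;   /* reference taken when the child is added */
		parent->nchildren++;
	}
	return n;
}

/* Drop one reference; on the last one, release the node, then its parent pin. */
static void node_put(struct node *n)
{
	struct node *parent;

	if (!n || --n->refcount > 0)
		return;
	parent = n->parent;
	if (parent) {
		/* Safe: the parent is still alive because this child pinned it. */
		assert(parent->nchildren > 0);
		parent->nchildren--;
	}
	n->log[(*n->count)++] = n->id;
	free(n);
	node_put(parent);         /* reference dropped when the child is released */
}

/*
 * Drop the owner's reference on the parent first, then on the child.
 * Returns 1 if the child (id 2) was released before the parent (id 1),
 * 0 otherwise, and -1 on allocation failure.
 */
static int demo_release_order(void)
{
	int log[2] = { 0, 0 };
	int count = 0;
	struct node *parent = node_create(1, NULL, log, &count);
	struct node *child;

	if (!parent)
		return -1;
	child = node_create(2, parent, log, &count);
	if (!child) {
		node_put(parent);
		return -1;
	}

	node_put(parent);         /* parent survives: the child still pins it */
	if (count != 0)
		return 0;
	node_put(child);          /* child is released first, then the parent */
	return count == 2 && log[0] == 2 && log[1] == 1;
}
```

Calling demo_release_order() from a small main() should return 1 under
these assumptions: dropping the owner's reference on the parent first
does not free it, and the parent is released only after the child.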