Date: Fri, 23 Jul 2021 13:14:10 +0200
From: Greg KH
To: Luis Chamberlain
Cc: Tejun Heo, rafael@kernel.org, davem@davemloft.net, kuba@kernel.org,
 ast@kernel.org, andriin@fb.com, daniel@iogearbox.net, atenart@kernel.org,
 alobakin@pm.me, weiwan@google.com, ap420073@gmail.com, jeyu@kernel.org,
 ngupta@vflare.org, sergey.senozhatsky.work@gmail.com, minchan@kernel.org,
 axboe@kernel.dk, mbenes@suse.com, jpoimboe@redhat.com, tglx@linutronix.de,
 keescook@chromium.org, jikos@kernel.org, rostedt@goodmis.org,
 peterz@infradead.org, linux-block@vger.kernel.org,
 netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4] sysfs: fix kobject refcount to address races with kobject removal
References: <20210623215007.862787-1-mcgrof@kernel.org>
 <20210625215558.xn4a24ts26bdyfzo@garbanzo>
 <20210701224816.pkzeyo4uqu3kbqdo@garbanzo>
 <20210722213137.jegpykf2ddwmmck5@garbanzo>
In-Reply-To: <20210722213137.jegpykf2ddwmmck5@garbanzo>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Jul 22, 2021 at 02:31:37PM -0700, Luis Chamberlain wrote:
> On Wed, Jul 21, 2021 at 01:30:29PM +0200, Greg KH wrote:
> > On Thu, Jul 01, 2021 at 03:48:16PM -0700, Luis Chamberlain wrote:
> > > On Fri, Jun 25, 2021 at 02:56:03PM -0700, Luis Chamberlain wrote:
> > > > On Thu, Jun 24, 2021 at 01:09:03PM +0200, Greg KH wrote:
> > > > > thanks for making this change and sticking with it!
> > > > >
> > > > > Oh, and with this change, does your modprobe/rmmod crazy test now work?
> > > >
> > > > It does but I wrote a test_sysfs driver and I believe I see an issue with
> > > > this. I'll debug a bit more and see what it was, and I'll then also use
> > > > the driver to demo the issue more clearly, and then verification can be
> > > > an easy selftest test.
> > >
> > > OK my conclusion based on a new selftest driver I wrote is we can drop
> > > this patch safely. The selftest will cover this corner case well now.
> > >
> > > In short: the kernfs active reference will ensure the store operation
> > > still exists. The kernfs mutex is not enough, but if the driver removes
> > > the operation prior to getting the active reference, the write will just
> > > fail. The dereferencing inside of the sysfs operation is abstract to
> > > kernfs, and while kernfs can't do anything to prevent a driver from
> > > doing something stupid, it at least can ensure that an open file ensures
> > > the op is not removed until the operation completes.
> >
> > Ok, so all is good?
>
> It would seem to be the case.
>
> > Then why is your zram test code blowing up so badly?
>
> I checked the logs for the backtrace where the crash did happen
> and we did see clear evidence of the race we feared here. The *first*
> bug that happened was the CPU hotplug race:
>
> [132004.787099] Error: Removing state 61 which has instances left.
> [132004.787124] WARNING: CPU: 17 PID: 9307 at ../kernel/cpu.c:1879 __cpuhp_remove_state_cpuslocked+0x1c4/0x1d0

I do not understand what this issue is, is it fixed?  Why is a cpu being
hot unplugged at the same time a zram?

thanks,

greg k-h
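
For readers following the thread, the active-reference argument quoted
above can be sketched in userspace C. This is only a rough model under
stated assumptions, not kernfs code: struct node, node_get_active(),
node_put_active(), node_deactivate(), node_drained(), node_write() and
demo_store() are all hypothetical names, and the counter-plus-bias scheme
is only loosely modeled on how kernfs marks a node as deactivated. It
shows the two properties described: a write that starts after removal has
begun just fails, and removal cannot finish while a write is in flight.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for a kernfs node; not the kernel's structure. */
struct node {
	atomic_int active;              /* >= 0: live, negative: being removed */
	int (*store)(const char *buf);  /* driver-provided write handler */
};

/* Adding this bias makes the counter negative so new users are refused. */
#define DEACTIVATED_BIAS INT_MIN

/* Take an active reference; fails once removal has started. */
static bool node_get_active(struct node *n)
{
	int v = atomic_load(&n->active);

	while (v >= 0) {
		/* On failure, v is reloaded and the sign is checked again. */
		if (atomic_compare_exchange_weak(&n->active, &v, v + 1))
			return true;
	}
	return false;
}

/* Drop an active reference taken with node_get_active(). */
static void node_put_active(struct node *n)
{
	int old = atomic_fetch_sub(&n->active, 1);

	/* A reference must have been held, deactivated or not. */
	assert(old != 0 && old != DEACTIVATED_BIAS);
}

/* True once removal has started and no operation is still in flight. */
static bool node_drained(struct node *n)
{
	return atomic_load(&n->active) == DEACTIVATED_BIAS;
}

/* Start removal (call once); returns whether the node is already drained. */
static bool node_deactivate(struct node *n)
{
	atomic_fetch_add(&n->active, DEACTIVATED_BIAS);
	return node_drained(n);
}

/* Write path: the handler only runs while an active reference is held. */
static int node_write(struct node *n, const char *buf)
{
	int ret;

	if (!node_get_active(n))
		return -ENODEV;         /* removal already started: just fail */
	ret = n->store(buf);
	node_put_active(n);
	return ret;
}

/* Hypothetical handler: reports how many bytes it "consumed". */
static int demo_store(const char *buf)
{
	return (int)strlen(buf);
}
```

In this model the remover would wait until node_drained() returns true
before freeing anything the handler touches. As the quoted text says, that
protects the operation itself and nothing more: whatever the handler
dereferences internally is still the driver's responsibility.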