Date: Tue, 22 Mar 2022 20:26:54 +0000
From: Al Viro
To: Tejun Heo
Cc: Imran Khan, gregkh@linuxfoundation.org, akpm@linux-foundation.org, linux-kernel@vger.kernel.org
Subject: Re: [RESEND PATCH v7 7/8] kernfs: Replace per-fs rwsem with hashed rwsems.
References: <20220317072612.163143-1-imran.f.khan@oracle.com>
 <20220317072612.163143-8-imran.f.khan@oracle.com>
 <536f2392-45d2-2f43-5e9d-01ef50e33126@oracle.com>

On Tue, Mar 22, 2022 at 07:08:58AM -1000, Tejun Heo wrote:
> > That's interesting... My impression had been that some of these functions
> > could be called from interrupt contexts (judging by the spin_lock_irqsave()
> > in there). What kind of async contexts those are, and what do you use to
> > make sure they don't leak into overlap with kernfs_remove()?
>
> The spin_lock_irqsave()'s are there because they're often used when printing
> messages, which can happen from any context. e.g. cpuset ends up calling into
> them to print current's cgroup under rcu_read_lock(), iocost to print a
> warning message under an irq-safe lock. In both and similar cases, the
> caller knows that the cgroup is accessible, which in turn guarantees that
> the kernfs node hasn't been deleted.

Wait a sec.  The choice of spin_lock_irqsave() vs. spin_lock_irq() is
affected by whether the lock may be taken with interrupts already disabled;
the choice of either vs. spin_lock() is not - those are needed only if you
might end up taking the spinlock in question from an interrupt handler.
"Under rcu_read_lock()" is irrelevant here...  The point of
spin_lock_irq()/spin_lock_irqsave() is to prevent

	spin_lock(&LOCK);	// locked
	... take an interrupt, enter the interrupt handler, and there run into
	spin_lock(&LOCK);	// and we spin forever

If there are no users in interrupt contexts, we are just fine with plain
spin_lock().
The only thing that matters wrt rcu_read_lock() is that we can't block
there; there are tons of plain spin_lock() calls done under those
conditions.  And rcu_read_lock() doesn't disable interrupts, so
spin_lock_irq() is usable under it.

Now, holding another spinlock taken with spin_lock_irq{,save}() *does*
prohibit the use of spin_lock_irq() - there you can use only spin_lock()
or spin_lock_irqsave().  The callchains that prohibit spin_lock() do
exist - for example, there's pr_cont_kernfs_path <- pr_cont_cgroup_path <-
transfer_surpluses <- ioc_timer_fn.

Out of curiosity, what guarantees that kernfs_remove() won't do fun things
to ancestors of iocg_to_blkg(iocg)->blkcg->css.cgroup for some iocg in
ioc->active_iocgs, until after ioc_rqos_exit(ioc) has finished
del_timer_sync()?