References: <20230628-meisennest-redlich-c09e79fde7f7@brauner>
 <20230628-faden-qualvoll-6c33b570f54c@brauner>
 <20230628-spotten-anzweifeln-e494d16de48a@brauner>
 <2023062845-stabilize-boogieman-1925@gregkh>
From: Suren Baghdasaryan
Date: Thu, 29 Jun 2023 17:59:07 -0700
Subject: Re: [PATCH 1/2] kernfs: add kernfs_ops.free operation to free
 resources tied to the file
To: Tejun Heo
Cc: Greg KH, Christian Brauner, peterz@infradead.org, lujialin4@huawei.com,
 lizefan.x@bytedance.com, hannes@cmpxchg.org, mingo@redhat.com,
 ebiggers@kernel.org, oleg@redhat.com, akpm@linux-foundation.org,
 viro@zeniv.linux.org.uk, juri.lelli@redhat.com, vincent.guittot@linaro.org,
 dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com,
 mgorman@suse.de, bristot@redhat.com, vschneid@redhat.com,
 linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, kernel-team@android.com
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Jun 28, 2023 at 2:50 PM Suren Baghdasaryan wrote:
>
> On Wed, Jun 28, 2023 at 1:34 PM Tejun Heo wrote:
> >
> > Hello, Suren.
> >
> > On Wed, Jun 28, 2023 at 01:12:23PM -0700, Suren Baghdasaryan wrote:
> > > AFAIU all other files that handle polling rely on f_op->release()
> > > being called after all the users are gone, therefore they can safely
> > > free their resources. However kernfs can call ->release() while there
> > > are still active users of the file. I can't use that operation for
> > > resource cleanup therefore I was suggesting to add a new operation
> > > which would be called only after the last fput() and would guarantee
> > > no users. Again, I'm not an expert in this, so there might be a better
> > > way to handle it. Please advise.
> >
> > So, w/ kernfs, the right thing to do is making sure that whatever is exposed
> > to the kernfs user is terminated on removal - ie. after kernfs_ops->release
> > is called, the ops table should be considered dead and there shouldn't be
> > anything left to clean up from the kernfs user side. You can add abstraction
> > kernfs so that kernfs can terminate the calls coming down from the higher
> > layers on its own. That's how every other operation is handled and what
> > should happen with the psi polling too.
>
> I'm not sure I understand. The waitqueue head we are freeing in
> ->release() can be accessed asynchronously and does not require any
> kernfs_op call. Here is a recap of that race:
>
>                                                 do_select
>                                                       vfs_poll
> cgroup_pressure_release
>     psi_trigger_destroy
>         wake_up_pollfree(&t->event_wait) -> unblocks vfs_poll
>         synchronize_rcu()
>         kfree(t) -> frees waitqueue head
>                                                 poll_freewait() -> UAF
>
> Note that poll_freewait() is not part of any kernfs_op, so I'm not
> sure how adding an abstraction kernfs would help, but again, this is
> new territory for me and I might be missing something.
>
> On a different note, I think there might be an easy way to fix this.
> What if psi triggers reuse kernfs_open_node->poll waitqueue head?
> Since we are overriding the ->poll() method, that waitqueue head is
> unused AFAICT. And best of all, its lifecycle is tied to the file's
> lifecycle, so it does not have the issue that trigger waitqueue head
> has. In the trigger I could simply store a pointer to that waitqueue
> and use it. Then in ->release() freeing trigger would not affect the
> waitqueue at all. Does that sound sane?

I think this approach is much cleaner and I'm guessing that's in line
with what Tejun was describing (maybe it's exactly what he was telling
me but it took time for me to get it). Posted the patch implementing
this approach here:
https://lore.kernel.org/all/20230630005612.1014540-1-surenb@google.com/

>
>
> >
> > Thanks.
> >
> > --
> > tejun
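
For readers following the thread, the lifetime argument above can be
sketched in plain userspace C. This is only an illustrative model and not
the posted patch: every struct and function below (open_node, trigger,
poll_wait_model, replay_release_before_freewait, and so on) is a
hypothetical stand-in, not a kernel definition. It assumes the waitqueue
head is owned by the per-open-file node and that the trigger merely
borrows a pointer to it, then replays the ordering from the race recap.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins, NOT the kernel definitions. */
struct wait_queue_head {
    int waiters;                        /* pollers currently queued */
};

struct open_node {                      /* models kernfs_open_node */
    struct wait_queue_head poll;        /* lives until the final fput() */
};

struct trigger {                        /* models a psi trigger */
    struct wait_queue_head *event_wait; /* borrowed pointer, not owned */
};

/* Trigger creation stores a pointer to the file-owned waitqueue head. */
static struct trigger *trigger_create(struct open_node *on)
{
    struct trigger *t = malloc(sizeof(*t));

    if (!t)
        return NULL;
    t->event_wait = &on->poll;
    return t;
}

/* Models ->release(): frees only the trigger, never the waitqueue head. */
static void trigger_destroy(struct trigger *t)
{
    free(t);
}

/* Poller side: queue on the head and remember which head was used. */
static struct wait_queue_head *poll_wait_model(struct trigger *t)
{
    t->event_wait->waiters++;
    return t->event_wait;
}

/* Poller side: dequeue later, as poll_freewait() would. */
static void poll_freewait_model(struct wait_queue_head *wqh)
{
    wqh->waiters--;
}

/*
 * Replays the ordering from the race recap: the poller queues itself,
 * release destroys the trigger, then the poller dequeues. Returns the
 * number of waiters left on the file-owned head, or -1 if allocation fails.
 */
static int replay_release_before_freewait(void)
{
    struct open_node *on = calloc(1, sizeof(*on));
    struct wait_queue_head *queued_on;
    struct trigger *t;
    int left;

    if (!on)
        return -1;
    t = trigger_create(on);
    if (!t) {
        free(on);
        return -1;
    }

    queued_on = poll_wait_model(t);     /* do_select -> vfs_poll */
    trigger_destroy(t);                 /* cgroup_pressure_release path */
    poll_freewait_model(queued_on);     /* head still valid: owned by on */

    left = on->poll.waiters;
    free(on);                           /* models the final fput() */
    return left;
}
```

In this model the dequeue step touches memory owned by open_node, which
outlives the trigger, so freeing the trigger first is harmless. If the
trigger embedded the waitqueue head instead, the same ordering would
dereference freed memory, which is the UAF in the recap.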