Received: by 2002:a05:6358:3188:b0:123:57c1:9b43 with SMTP id q8csp18300558rwd; Tue, 27 Jun 2023 15:02:35 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ54Io8jOJd50nqSpTnG5z3M1KLqy492ny9tnDLjh1PYqJj3VQYmDpJKDfHdTWwUczUJa1/9 X-Received: by 2002:a17:907:1dec:b0:991:e3c4:c129 with SMTP id og44-20020a1709071dec00b00991e3c4c129mr3991476ejc.69.1687903355308; Tue, 27 Jun 2023 15:02:35 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1687903355; cv=none; d=google.com; s=arc-20160816; b=LZV6/xfVtnaXF4t7t0CQ/EUTZT19tbaOcqdvACAfRnSPo1wyxltrB4znu4RaaPZYbu T2mAs7QQ4tqZeCRajSlNGAbQBMB7QVLPYRr0Aq9x/2BVzy88fsbDMkOZ1h5ymLfbSvLE dZTLH5xOIRpNWrTneZM54WetrStJ+1pw/ni28O1JwPh7dLAREDLppGriwk7Enl9JEIFD fQiN4i2xzhzEBGqYnOLsnTSEzI3GeKZloHKsqcE0kxtOulEm8/z28G6jUR4sz20eROPA Ra+KJ+BZYiDn5CK2aOrXsGr2hJX+2ebQCa9l7N6wW/gz8YSYzURTT00GhImrnsjMHZIZ PaMg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:cc:to:subject :message-id:date:from:in-reply-to:references:mime-version :dkim-signature; bh=KJ+N2CuvmAc+wSVX5vK53Z8eetMsRLGAFF7lJpIJfks=; fh=gWOXbR8Go8ZZzSq1qnvwsWT+mN+uIrppteKEhejTBsg=; b=F1UbiCv36yJNDN6oesn2mihGW+jrYz05+ayrQMEvcICG7MXNN0XUQ1op1Ujkqb3K5c t/HbnLHODTM4b8s//Kv6+6AwhqYqn63jJxS48fuN+DC0HbsbEIIdWdamjzes0NF0ZmCp kvpfoM3NJpiEYH6CRIbd59b2e/S9A61n8cpGkrp3ohlPWYo1acBWahIWvo1ueOdcmKpV DfUrT6gMHU7I6zads7BlMVBS3mfVylPGOl5MCEqTsLDgOq95ZoFNhka0roiHn/LXoABH f2RbKN9CxGxhSD37jCwDqgFEZ94Bh6WPtA82Iu2aVFT7C7M4FyP6DYaFd+OLcCuJ+wnu kK4A== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20221208 header.b=yZmGrcWc; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id gj25-20020a170906e11900b0098e0fb375e9si4266773ejb.7.2023.06.27.15.02.10; Tue, 27 Jun 2023 15:02:35 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20221208 header.b=yZmGrcWc; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230086AbjF0V6w (ORCPT + 99 others); Tue, 27 Jun 2023 17:58:52 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48756 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229912AbjF0V6W (ORCPT ); Tue, 27 Jun 2023 17:58:22 -0400 Received: from mail-yb1-xb2f.google.com (mail-yb1-xb2f.google.com [IPv6:2607:f8b0:4864:20::b2f]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 52AF110D8 for ; Tue, 27 Jun 2023 14:58:20 -0700 (PDT) Received: by mail-yb1-xb2f.google.com with SMTP id 3f1490d57ef6-c13280dfb09so4056224276.2 for ; Tue, 27 Jun 2023 14:58:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20221208; t=1687903099; x=1690495099; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=KJ+N2CuvmAc+wSVX5vK53Z8eetMsRLGAFF7lJpIJfks=; b=yZmGrcWcj5pZWgnmUzy8P0MNqCMof88icxobA86fP/l45vue+mRRI6eWcXK63Ef3Go FTI//QnmS1jCsU1O7a31qy/2/oGu88vMSOXH8iw0oXdmRDwQZWkmqSaSJPov/40JcKyV R8UEG143N5iPzajAx0cfIecyu4OYnohi1A8XXcKOiUgXMYNd0LCmG3b+4UHogC3kUBjZ 5nAwVTsD2r92uhw3P9j/uCFa5F4nh8POi4STatDMKBzfObiIAE0j3BT79Kz+R4dhmsYE ZhpOfEfuAHped4zA05Qz+Jbp+x7TpwvW1OjM67VFetV3y+OphL91lSriK/CmLXfG7f9i FkIg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1687903099; x=1690495099; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=KJ+N2CuvmAc+wSVX5vK53Z8eetMsRLGAFF7lJpIJfks=; b=Y/dtLE1Q99Bp9xLsrO3rQi3H/R5fjczcqbn1DOnW/8zyr4mRY/ddKby9l4DkuSZeHM pip+DkqrL7WkDSxrg9sEhgnR353wFGI6EczCLFGEZ6HXinmrMjfXF+I7T7MPCcyxpq3S u5O0GV3QE3HnVRL2csEuETXtlGu9VMoz9n0xr6biiukNHvu+qBHWyThOdx4W2kdF/vpn lMKa+mRyz6KpiypMl96PMk5vRrrqIHmqysrqlaXu91+QtcATfKfxzyhj6o3fMwycIyz0 E+p0IFfKkCPD7ZXZRIJmG+pmmWQZMpjSEgca/2f9+oV9PZv2NMbaMKGNWsbn+dggAt/G LgXQ== X-Gm-Message-State: AC+VfDxLBzsuud1uZLDVaOT35gq74E4AJgFuhSBsEZeD2owmHzVH7uGV Y8E4gSxWIQEBJX/oLi/kU2eTuN+NvoEnsAUOfHxD4Q== X-Received: by 2002:a25:a4aa:0:b0:c00:e6c4:1812 with SMTP id g39-20020a25a4aa000000b00c00e6c41812mr16841209ybi.63.1687903099324; Tue, 27 Jun 2023 14:58:19 -0700 (PDT) MIME-Version: 1.0 References: <20230626201713.1204982-1-surenb@google.com> <20230627-kanon-hievt-bfdb583ddaa6@brauner> <20230627-ausgaben-brauhaus-a33e292558d8@brauner> In-Reply-To: From: Suren Baghdasaryan Date: Tue, 27 Jun 2023 14:58:08 -0700 Message-ID: Subject: Re: [PATCH 1/2] kernfs: add kernfs_ops.free operation to free resources tied to the file To: Tejun Heo Cc: Christian Brauner , gregkh@linuxfoundation.org, peterz@infradead.org, lujialin4@huawei.com, lizefan.x@bytedance.com, hannes@cmpxchg.org, mingo@redhat.com, ebiggers@kernel.org, oleg@redhat.com, akpm@linux-foundation.org, viro@zeniv.linux.org.uk, juri.lelli@redhat.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de, bristot@redhat.com, vschneid@redhat.com, linux-kernel@vger.kernel.org, cgroups@vger.kernel.org, linux-fsdevel@vger.kernel.org, kernel-team@android.com Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-17.6 required=5.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF, ENV_AND_HDR_SPF_MATCH,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS, T_SCC_BODY_TEXT_LINE,USER_IN_DEF_DKIM_WL,USER_IN_DEF_SPF_WL autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Jun 27, 2023 at 2:43=E2=80=AFPM Suren Baghdasaryan wrote: > > On Tue, Jun 27, 2023 at 1:09=E2=80=AFPM Suren Baghdasaryan wrote: > > > > On Tue, Jun 27, 2023 at 11:42=E2=80=AFAM Tejun Heo wrot= e: > > > > > > Hello, Christian. > > > > > > On Tue, Jun 27, 2023 at 07:30:26PM +0200, Christian Brauner wrote: > > > ... > > > > ->release() was added in > > > > > > > > commit 0e67db2f9fe91937e798e3d7d22c50a8438187e1 > > > > kernfs: add kernfs_ops->open/release() callbacks > > > > > > > > Add ->open/release() methods to kernfs_ops. ->open() is called= when > > > > the file is opened and ->release() when the file is either rele= ased or > > > > severed. These callbacks can be used, for example, to manage > > > > persistent caching objects over multiple seq_file iterations. > > > > > > > > Signed-off-by: Tejun Heo > > > > Acked-by: Greg Kroah-Hartman > > > > Acked-by: Acked-by: Zefan Li > > > > > > > > which mentions "either releases or severed" which imho already poin= ts to > > > > separate methods. > > > > > > This is because kernfs has revoking operation which doesn't exist for= other > > > filesystems. Other filesystem implemenations can't just say "I'm done= . Bye!" > > > and go away. Even if the underlying filesystem has completely failed,= the > > > code still has to remain attached and keep aborting operations. > > > > > > However, kernfs serves as the midlayer to a lot of device drivers and= other > > > internal subsystems and it'd be really inconvenient for each of them = to have > > > to implement "I want to go away but I gotta wait out this user who's = holding > > > onto my tuning knob file". So, kernfs exposes a revoke or severing se= mantics > > > something that's exposing interface through kernfs wants to stop doin= g so. > > > > > > If you look at it from file operation implementation POV, this seems = exactly > > > like ->release. All open files are shutdown and there won't be any fu= ture > > > operations. After all, revoke is forced closing of all fd's. So, for = most > > > users, treating severing just like ->release is the right thing to do= . > > > > > > The PSI file which caused this is a special case because it attaches > > > something to its kernfs file which outlives the severing operation by= passing > > > kernfs infra. A more complete way to fix this would be supporting the > > > required behavior from kernfs side, so that the PSI file operates on = kernfs > > > interface which knows the severing event and detaches properly. That = said, > > > currently, this is very much an one-off. > > > > > > Suren, if you're interested, it might make sense to pipe poll through= kernfs > > > properly so that it has its kernfs operation and kernfs can sever it.= That > > > said, as this is a fix for something which is currently causing crash= es, > > > it'd be better to merge this simpler fix first no matter what. > > > > I'm happy to implement the right fix if you go into more details. > > AFAIKT kernfs_ops already has poll() operation, we are hooking > > cgroup_file_poll() to it and using kernfs_generic_poll(). I thought > > this is the right way to pipe poll through kernfs but if that's > > incorrect, please let me know. I'm happy to fix that. > > Ah, sorry, for PSI we are not using kernfs_generic_poll(), so my claim > misrepresents the situation. Let me look into how > kernfs_generic_poll() is implemented and maybe I can find a better > solution for PSI. Ok in kernfs_generic_poll() we are using kernfs_open_node.poll waitqueue head for polling and kernfs_open_node is freed from inside kernfs_unlink_open_file() which is called from kernfs_fop_release(). So, it is destroyed only when the last fput() is done, unlike the ops->release() operation which we are using for destroying PSI trigger's waitqueue. So, it seems we still need an operation which would indicate that the file is truly going away. Christian's suggestion to rename current ops->release() operation into ops->drain() (or ops->flush() per Matthew's request) and introduce a "new" ops->release() which is called only when the last fput() is done seems sane to me. Would everyone be happy with that approach? > Thanks, > Suren. > > > Thanks, > > Suren. > > > > > > > > Thanks. > > > > > > > > -- > > > tejun