From: Suren Baghdasaryan
Date: Wed, 28 Jun 2023 14:50:59 -0700
Subject: Re: [PATCH 1/2] kernfs: add kernfs_ops.free operation to free resources tied to the file
To: Tejun Heo
Cc: Greg KH, Christian Brauner, peterz@infradead.org, lujialin4@huawei.com,
    lizefan.x@bytedance.com, hannes@cmpxchg.org, mingo@redhat.com,
    ebiggers@kernel.org, oleg@redhat.com, akpm@linux-foundation.org,
    viro@zeniv.linux.org.uk, juri.lelli@redhat.com, vincent.guittot@linaro.org,
    dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com,
    mgorman@suse.de, bristot@redhat.com, vschneid@redhat.com,
    linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
    linux-fsdevel@vger.kernel.org, kernel-team@android.com
References: <20230628-meisennest-redlich-c09e79fde7f7@brauner>
    <20230628-faden-qualvoll-6c33b570f54c@brauner>
    <20230628-spotten-anzweifeln-e494d16de48a@brauner>
    <2023062845-stabilize-boogieman-1925@gregkh>

On Wed, Jun 28, 2023 at 1:34 PM Tejun Heo wrote:
>
> Hello, Suren.
>
> On Wed, Jun 28, 2023 at 01:12:23PM -0700, Suren Baghdasaryan wrote:
> > AFAIU all other files that handle polling rely on f_op->release()
> > being called after all the users are gone, therefore they can safely
> > free their resources. However kernfs can call ->release() while there
> > are still active users of the file. I can't use that operation for
> > resource cleanup therefore I was suggesting to add a new operation
> > which would be called only after the last fput() and would guarantee
> > no users. Again, I'm not an expert in this, so there might be a better
> > way to handle it. Please advise.
>
> So, w/ kernfs, the right thing to do is making sure that whatever is exposed
> to the kernfs user is terminated on removal - ie. after kernfs_ops->release
> is called, the ops table should be considered dead and there shouldn't be
> anything left to clean up from the kernfs user side. You can add abstraction
> kernfs so that kernfs can terminate the calls coming down from the higher
> layers on its own. That's how every other operation is handled and what
> should happen with the psi polling too.

I'm not sure I understand. The waitqueue head we are freeing in
->release() can be accessed asynchronously and does not require any
kernfs_op call.
Here is a recap of that race:

do_select
      vfs_poll
                           cgroup_pressure_release
                             psi_trigger_destroy
                               wake_up_pollfree(&t->event_wait) -> unblocks vfs_poll
                               synchronize_rcu()
                               kfree(t) -> frees waitqueue head
      poll_freewait() -> UAF

Note that poll_freewait() is not part of any kernfs_op, so I'm not sure
how adding an abstraction kernfs would help, but again, this is new
territory for me and I might be missing something.

On a different note, I think there might be an easy way to fix this.
What if psi triggers reuse the kernfs_open_node->poll waitqueue head?
Since we are overriding the ->poll() method, that waitqueue head is
unused AFAICT. And best of all, its lifecycle is tied to the file's
lifecycle, so it does not have the issue that the trigger's waitqueue
head has. In the trigger I could simply store a pointer to that
waitqueue and use it. Then in ->release(), freeing the trigger would
not affect the waitqueue at all. Does that sound sane?

>
> Thanks.
>
> --
> tejun
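
To make the ownership idea a bit more concrete, here is a rough userspace
sketch. It is not kernel code and not a patch: struct wq_head, struct
open_node, struct trigger and all the helper functions below are hypothetical
stand-ins (for wait_queue_head_t, kernfs_open_node, the psi trigger,
poll_wait()/poll_freewait() and so on), and locking and RCU are left out
entirely. It only models the lifetime argument: the trigger borrows a pointer
to the file-scoped waitqueue, so freeing the trigger in ->release() leaves the
waitqueue valid for a poller that still holds a reference to it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for wait_queue_head_t; only counts waiters. */
struct wq_head {
    int waiters;
};

/* Hypothetical stand-in for kernfs_open_node: owns the poll waitqueue,
 * which therefore lives as long as the open file does. */
struct open_node {
    struct wq_head poll;
};

/* Hypothetical stand-in for a psi trigger that borrows the file's
 * waitqueue instead of embedding its own. */
struct trigger {
    struct wq_head *event_wait; /* borrowed, not owned by the trigger */
};

struct open_node *open_node_create(void)
{
    return calloc(1, sizeof(struct open_node));
}

struct trigger *trigger_create(struct open_node *on)
{
    struct trigger *t = calloc(1, sizeof(*t));

    if (t)
        t->event_wait = &on->poll;
    return t;
}

/* Models ->release(): frees the trigger only, never the waitqueue. */
void trigger_destroy(struct trigger *t)
{
    free(t);
}

/* Models a poller registering itself; it keeps the returned pointer. */
struct wq_head *poll_register(struct trigger *t)
{
    t->event_wait->waiters++;
    return t->event_wait;
}

/* Models the poller detaching later, possibly after ->release() ran. */
void poll_unregister(struct wq_head *wq)
{
    wq->waiters--;
}

/* Models the final fput(): only now does the waitqueue go away. */
void open_node_destroy(struct open_node *on)
{
    free(on);
}
```

In this model the sequence "register poller, destroy trigger, unregister
poller, destroy open node" never touches freed memory, which is the property
the race above lacks when the waitqueue head is embedded in the trigger.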