Received: by 2002:a05:7412:31a9:b0:e2:908c:2ebd with SMTP id et41csp3816411rdb; Thu, 14 Sep 2023 03:43:19 -0700 (PDT) X-Google-Smtp-Source: AGHT+IHkrmaXI7BUIhc6ISEnPdFBAHdPNhOqqQhsevA6n93ZGnqOzKhs5ga31t1FPivINGBy9wZW X-Received: by 2002:a17:902:d491:b0:1b8:76ce:9dab with SMTP id c17-20020a170902d49100b001b876ce9dabmr6170816plg.41.1694688198788; Thu, 14 Sep 2023 03:43:18 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1694688198; cv=none; d=google.com; s=arc-20160816; b=L0eFilCcQ2su9V3FVCCAv6qUTQFDqw6oOJ052sfVcxEAMQAcJbV1zDNMd5VOCjFcHR X/7MAxBgImC9E0woptFs7yCOsuzwkbzuf1QTwrR8u1wYs+6hRGwsXCEN3TN1Q2SvNumj gJD1wvJnJ+E+7tUptn0TrtGzj43jKZcz6Q1m9zeffhg3pDqw/VL5izlpS9Bil3DkD3Wg XSSgu24kBm88L2aHGZecVdb4xvbZOIPGNM3b+aiWVoM/YZmvZOIFGbAUQQFauobdE19b +IoSFSOtEJBXjeNBZ8UaobOXewZYQ92ujSCUq2MaiC0iacUq+hHrSSq/x7GbrzMCHpfL CVCg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:in-reply-to:content-disposition :mime-version:references:message-id:subject:cc:to:from:date :dkim-signature; bh=RmUYqF8wwggq/9y9Fc7orMwmqO8RbpvM81PaCv8SZQM=; fh=ggAtyYCLjQC+aEaWV6nS5QgUNtsrjVB3JQO9PZCF1so=; b=vqwNtoTP6PoKRpMy5m4ur0PlUuEmPUaZqgSwAggSSQMbhaDkASCqtQRA6xI+b07nsS jZJBh8GlBYe+jV31pcMetKZ86HKuBC/WGYZzJadPZS0a7tBulR8sBTz3HsmHz3XLiEu6 45KAV1AG0XJatmFMX6IyVedw/IA23XSQp8+ojDRJBQFOQ050upXip+G1VxAQy40iHOWC ao5+VIAj2zFTVHFEWDl9RRhpj3gV8pXqll2OaMFDm8M7ShDJCH6CKXobc7hyokZ7FEyA xE6Vz1mIA+A0AWq1xxlefqrAEmTfC49wkElMUqvTlD5rVmWbxtuZH0V0DjMX2JLezeis 6tlw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linux.org.uk header.s=zeniv-20220401 header.b=C6Dr0C1U; spf=pass (google.com: domain of linux-nfs-owner@vger.kernel.org designates 2620:137:e000::3:2 as permitted sender) smtp.mailfrom=linux-nfs-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=zeniv.linux.org.uk Return-Path: Received: from agentk.vger.email (agentk.vger.email. [2620:137:e000::3:2]) by mx.google.com with ESMTPS id kx16-20020a170902f95000b001bbad1883d5si1328757plb.293.2023.09.14.03.43.18 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 14 Sep 2023 03:43:18 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-nfs-owner@vger.kernel.org designates 2620:137:e000::3:2 as permitted sender) client-ip=2620:137:e000::3:2; Authentication-Results: mx.google.com; dkim=pass header.i=@linux.org.uk header.s=zeniv-20220401 header.b=C6Dr0C1U; spf=pass (google.com: domain of linux-nfs-owner@vger.kernel.org designates 2620:137:e000::3:2 as permitted sender) smtp.mailfrom=linux-nfs-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=zeniv.linux.org.uk Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by agentk.vger.email (Postfix) with ESMTP id 315C28260C13; Wed, 13 Sep 2023 22:39:45 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at agentk.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233468AbjINFj2 (ORCPT + 99 others); Thu, 14 Sep 2023 01:39:28 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50400 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233128AbjINFj2 (ORCPT ); Thu, 14 Sep 2023 01:39:28 -0400 Received: from zeniv.linux.org.uk (zeniv.linux.org.uk [IPv6:2a03:a000:7:0:5054:ff:fe1c:15ff]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 218CF8E; Wed, 13 Sep 2023 22:39:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=linux.org.uk; s=zeniv-20220401; h=Sender:In-Reply-To:Content-Type: MIME-Version:References:Message-ID:Subject:Cc:To:From:Date:Reply-To: Content-Transfer-Encoding:Content-ID:Content-Description; bh=RmUYqF8wwggq/9y9Fc7orMwmqO8RbpvM81PaCv8SZQM=; b=C6Dr0C1U8pj+4fMEBibEuICDhE EEkjg/MijmAsizo0zcioXNtd4W36c9mdnd0D2zVFTw1TtapnOBvqUb0vSnDBFJrCmDfpvOcOHONne cbQFyqeH9CCEfvuTIGraM2Hn1ukGaDmyEoVKqTlKg3th4e10i2SpPoAiTy38YK2trc+CoNIrPrQV4 K0oeDAvm57/3q8xZeBv5OlwDsHVPFMz59m8XSYjqo9e4HqV33cx972kX4mV9uO3c30QWCKQcVDJG8 jsGCjoM8olig+SQBNMkZp5uKci8dDFRUTLEne7SsmOttj+ApgnNyMCXaO1Qg8iqGCwTMtlWa9kvki ClVVewHA==; Received: from viro by zeniv.linux.org.uk with local (Exim 4.96 #2 (Red Hat Linux)) id 1qgf3r-005vkH-2N; Thu, 14 Sep 2023 05:38:44 +0000 Date: Thu, 14 Sep 2023 06:38:43 +0100 From: Al Viro To: Christoph Hellwig Cc: Christian Brauner , Heiko Carstens , Vasily Gorbik , Alexander Gordeev , Fenghua Yu , Reinette Chatre , Miquel Raynal , Richard Weinberger , Vignesh Raghavendra , Dennis Dalessandro , Tejun Heo , Trond Myklebust , Anna Schumaker , Kees Cook , Damien Le Moal , Naohiro Aota , Greg Kroah-Hartman , linux-usb@vger.kernel.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, linux-rdma@vger.kernel.org, linux-nfs@vger.kernel.org, linux-hardening@vger.kernel.org, cgroups@vger.kernel.org Subject: Re: [PATCH 03/19] fs: release anon dev_t in deactivate_locked_super Message-ID: <20230914053843.GI800259@ZenIV> References: <20230913111013.77623-1-hch@lst.de> <20230913111013.77623-4-hch@lst.de> <20230913232712.GC800259@ZenIV> <20230914023705.GH800259@ZenIV> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20230914023705.GH800259@ZenIV> Sender: Al Viro Precedence: bulk List-ID: X-Mailing-List: linux-nfs@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (agentk.vger.email [0.0.0.0]); Wed, 13 Sep 2023 22:39:45 -0700 (PDT) X-Spam-Status: No, score=-0.8 required=5.0 tests=DKIM_SIGNED,DKIM_VALID, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,SPF_HELO_NONE, SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on agentk.vger.email On Thu, Sep 14, 2023 at 03:37:05AM +0100, Al Viro wrote: > On Thu, Sep 14, 2023 at 12:27:12AM +0100, Al Viro wrote: > > On Wed, Sep 13, 2023 at 08:09:57AM -0300, Christoph Hellwig wrote: > > > Releasing an anon dev_t is a very common thing when freeing a > > > super_block, as that's done for basically any not block based file > > > system (modulo the odd mtd special case). So instead of requiring > > > a special ->kill_sb helper and a lot of boilerplate in more complicated > > > file systems, just release the anon dev_t in deactivate_locked_super if > > > the super_block was using one. > > > > > > As the freeing is done after the main call to kill_super_notify, this > > > removes the need for having two slightly different call sites for it. > > > > Huh? At this stage in your series freeing is still in ->kill_sb() > > instances, after the calls of kill_anon_super() you've turned into > > the calls of generic_shutdown_super(). > > > > You do split it off into a separate method later in the series, but > > at this point you are reopening the same UAF that had been dealt with > > in dc3216b14160 "super: ensure valid info". > > > > Either move the introduction of ->free_sb() before that one, or > > split it into lifting put_anon_bdev() (left here) and getting rid > > of kill_anon_super() (after ->free_sb() introduction). > > Actually, looking at the final stage in the series, you still have > kill_super_notify() done *AFTER* ->free_sb() call. So the problem > persists until the very end... It's worse - look at the rationale for 2c18a63b760a "super: wait until we passed kill super". Basically, "don't remove from the lists until after block device closing". IOW, we have * stuff that needs to be done before generic_shutdown_super() (things like pinned dentries on ramfs, etc.) * generic_shutdown_super() itself (dentry/inode eviction, optionally ->put_super()) * stuff that needs to be done before eviction from the lists (block device closing, since 2c18a63b760a) * eviction from the lists * stuff that needs to be done *after* eviction from the lists. BTW, this part of commit message in 2c18a63b760a is rather confused: Recent rework moved block device closing out of sb->put_super() and into sb->kill_sb() to avoid deadlocks as s_umount is held in put_super() and blkdev_put() can end up taking s_umount again. That was *NOT* what a recent rework had done. Block device closing had never been inside ->put_super() - at no point since that (closing, that is) had been introduced back in 0.97 ;-) ->put_super() predates it (0.95c+). The race is real, but the cause is not some kind of move of blkdev_put(). Your 2ea6f68932f7 "fs: use the super_block as holder when mounting file systems" is where it actually came from. Christoph, could you explain what the hell do we need that for? It does create the race in question and AFAICS 2c18a63b760a (and followups trying to plug holes in it) had been nothing but headache. Old logics: if mount attempt with a different fs type happens, -EBUSY is precisely corrent - we would've gotten just that if mount() came before umount(). If the type matches, we might 1) come before deactivate_locked_super() by umount(2). No problem, we succeed. 2) come after the beginning of shutdown, but before the removal from the list; fine, we'll wait for the sucker to be unlocked (which happens in the end of generic_shutdown_super()), notice it's dead and create a new superblock. Since the only part left on the umount side is closing the device, we are just fine. 3) come after the removal from the list. So we won't wait for the old superblock to be unlocked, other than that it's exactly the same as (2). It doesn't matter whether we open the device before or after close by umount - same owner anyway, no -EBUSY. Your "owner shall be the superblock" breaks that... If you want to mess with _three_-way split of ->kill_sb(), please start with writing down the rules re what should go into each of those parts; such writeup should go into Documentation/filesystems/porting anyway, even if the split is a two-way one, BTW.