Received: by 2002:a05:7412:da14:b0:e2:908c:2ebd with SMTP id fe20csp2157590rdb; Mon, 9 Oct 2023 14:58:28 -0700 (PDT) X-Google-Smtp-Source: AGHT+IGPJepEyZ8Z57HMfyqHIAPppQWffLeFzPshufnG2lm/Fx1ZxUVRwZMDnwwef06KFyzbJT+r X-Received: by 2002:a17:90a:ec06:b0:269:5adb:993 with SMTP id l6-20020a17090aec0600b002695adb0993mr12103802pjy.22.1696888707736; Mon, 09 Oct 2023 14:58:27 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1696888707; cv=none; d=google.com; s=arc-20160816; b=UzKw5Z1vBfDGCILMnmfnpyrnzPEQMbVkrTzqpxJ4YEbAsa85XlpTaEMN70QYQ5LdL3 NWXqgcV5QZ5K/fbe2KShE1YIToTYgtbS8tVdOr7brrTuF3lvEOcjUmZQqpamiD/PhAuw ogfbtUuKFKujd1trz8/kMs5UWvbZOj6UFNXAVrEQEGh8tEnUWmRIRQ+JUFkYDIhi8MzK PllCiUqIuLjCCh035xqt2mwgjMQxA87+9rhXXTlxaZ33UhQE4a6h4j7T5XpQ+xakhxJd 3IwaBelS66uhci4g4HD4Pkylz3wPvl+r2pmChE3xmhODQ+pVq2hCY2kIzIQ3Pg+UJgJo IaQA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:in-reply-to:content-disposition :mime-version:references:message-id:subject:cc:to:from:date :dkim-signature; bh=Uo1EiyojZdAoskTvMvtwCcjNoSWtdzYtyhDVJ0XRCLA=; fh=ggAtyYCLjQC+aEaWV6nS5QgUNtsrjVB3JQO9PZCF1so=; b=TBWGdczXyQ3CQZM4JnWZHholzdfDqTv7gyyLS2lX0x834hIiSmOcZk0F/jwvEHOyrd MamfQcd7se4y0ns140GxmjPR+dqbktA53A4qamirh9nABWSpKlmgM3MB20OeO33QBCzA T1ckw0r2GYfwVWGVjztW/wsXH91vlkeeBbHz+jl+FpYsWKZ2zB7CiN/PNLniGAlCB9bX pZPU6irOBocNv+dE2OEG7ircVvzMrd17i6nre8OQCE4hUZqXjQOSYXoK+sVRHg47LopA Js0NzAwElK/HW7B+8hPxTStrEsUznng4GVsA5EBmZUdEyXHG6wzSsV4WlZa9wBk3BvRs XhLA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linux.org.uk header.s=zeniv-20220401 header.b=tQPA9yIq; spf=pass (google.com: domain of linux-nfs-owner@vger.kernel.org designates 23.128.96.36 as permitted sender) smtp.mailfrom=linux-nfs-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=zeniv.linux.org.uk Return-Path: Received: from pete.vger.email (pete.vger.email. [23.128.96.36]) by mx.google.com with ESMTPS id om9-20020a17090b3a8900b002777ce4c9e6si11393639pjb.72.2023.10.09.14.58.27 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 09 Oct 2023 14:58:27 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-nfs-owner@vger.kernel.org designates 23.128.96.36 as permitted sender) client-ip=23.128.96.36; Authentication-Results: mx.google.com; dkim=pass header.i=@linux.org.uk header.s=zeniv-20220401 header.b=tQPA9yIq; spf=pass (google.com: domain of linux-nfs-owner@vger.kernel.org designates 23.128.96.36 as permitted sender) smtp.mailfrom=linux-nfs-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=zeniv.linux.org.uk Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by pete.vger.email (Postfix) with ESMTP id 51891802C92B; Mon, 9 Oct 2023 14:58:22 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at pete.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1378824AbjJIV6V (ORCPT + 99 others); Mon, 9 Oct 2023 17:58:21 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60822 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1378789AbjJIV6U (ORCPT ); Mon, 9 Oct 2023 17:58:20 -0400 Received: from zeniv.linux.org.uk (zeniv.linux.org.uk [IPv6:2a03:a000:7:0:5054:ff:fe1c:15ff]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A2B5499; Mon, 9 Oct 2023 14:58:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=linux.org.uk; s=zeniv-20220401; h=Sender:In-Reply-To:Content-Type: MIME-Version:References:Message-ID:Subject:Cc:To:From:Date:Reply-To: Content-Transfer-Encoding:Content-ID:Content-Description; bh=Uo1EiyojZdAoskTvMvtwCcjNoSWtdzYtyhDVJ0XRCLA=; b=tQPA9yIqHmqNSbhOVvxiucllWR l2tso1BLwh5CGWav21hSNoCV18PYO+IsxWdqSb/pXzM2cPknMXd9P7ke0zHb6UliQGYi6gLyRpPKv U7hiFaveQ7mceuNDQ8jXcyNC9Wn3AkdirOyK8p0NOTO/Oj6yURkgBAyy5W9y7bWeLSH/O6DDIkmyl WCp+Xw1NcUBfI4VrenMtUjHlxEHqjrSYPedcuoxv/CrT5Nzph7SfGqANRPJyxErt/tMNNVW/JgwXK fM0gVUEdTnwkyob+53dTu2ypbGyfYDxKYT5I9ypDKvQN6UcsFeM5PE4elD0hso579aRXCVJbRT5mn T7zz5aAw==; Received: from viro by zeniv.linux.org.uk with local (Exim 4.96 #2 (Red Hat Linux)) id 1qpyGA-00HK0m-1f; Mon, 09 Oct 2023 21:57:54 +0000 Date: Mon, 9 Oct 2023 22:57:54 +0100 From: Al Viro To: Christoph Hellwig Cc: Christian Brauner , Heiko Carstens , Vasily Gorbik , Alexander Gordeev , Fenghua Yu , Reinette Chatre , Miquel Raynal , Richard Weinberger , Vignesh Raghavendra , Dennis Dalessandro , Tejun Heo , Trond Myklebust , Anna Schumaker , Kees Cook , Damien Le Moal , Naohiro Aota , Greg Kroah-Hartman , linux-usb@vger.kernel.org, linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org, linux-rdma@vger.kernel.org, linux-nfs@vger.kernel.org, linux-hardening@vger.kernel.org, cgroups@vger.kernel.org Subject: Re: [PATCH 03/19] fs: release anon dev_t in deactivate_locked_super Message-ID: <20231009215754.GL800259@ZenIV> References: <20230913111013.77623-1-hch@lst.de> <20230913111013.77623-4-hch@lst.de> <20230913232712.GC800259@ZenIV> <20230926093834.GB13806@lst.de> <20230926212515.GN800259@ZenIV> <20231002064646.GA1799@lst.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20231002064646.GA1799@lst.de> Sender: Al Viro X-Spam-Status: No, score=2.8 required=5.0 tests=DKIM_SIGNED,DKIM_VALID, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,RCVD_IN_SBL_CSS, SPF_HELO_NONE,SPF_PASS autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on pete.vger.email Precedence: bulk List-ID: X-Mailing-List: linux-nfs@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (pete.vger.email [0.0.0.0]); Mon, 09 Oct 2023 14:58:22 -0700 (PDT) X-Spam-Level: ** On Mon, Oct 02, 2023 at 08:46:46AM +0200, Christoph Hellwig wrote: > On Tue, Sep 26, 2023 at 10:25:15PM +0100, Al Viro wrote: > > Before your patch: foo_kill_super() calls kill_anon_super(), > > which calls kill_super_notify(), which removes the sucker from > > the list, then frees ->s_fs_info. After your patch: > > removal from the lists happens via the call of kill_super_notify() > > *after* both of your methods had been called, while freeing > > ->s_fs_info happens from the method call. IOW, you've restored > > the situation prior to "super: ensure valid info". The whole > > point of that commit had been to make sure that we have nothing > > in the lists with ->s_fs_info pointing to a freed object. > > > > It's not about free_anon_bdev(); that part is fine - it's the > > "we can drop the weird second call site of kill_super_notify()" > > thing that is broken. > > The point has been to only release the anon dev_t after > kill_super_notify, to prevent two of them beeing reused. > > Which we do as the free_anon_bdev is done directly in > deactivate_locked_super. The new ->free_sb for non-block file systems > frees resources, but none of them matter for sget. We keep talking past each other... Let me try again: at the tip of your branch you have static struct file_system_type ubifs_fs_type = { .name = "ubifs", .owner = THIS_MODULE, .mount = ubifs_mount, .free_sb = ubifs_free_sb, }; static void ubifs_free_sb(struct super_block *s) { kfree(s->s_fs_info); } static struct dentry *ubifs_mount(struct file_system_type *fs_type, int flags, const char *name, void *data) { ... sb = sget(fs_type, sb_test, sb_set, flags, c); ... } static int sb_test(struct super_block *sb, void *data) { struct ubifs_info *c1 = data; struct ubifs_info *c = sb->s_fs_info; return c->vi.cdev == c1->vi.cdev; } See the problem? Mainline has static void kill_ubifs_super(struct super_block *s) { struct ubifs_info *c = s->s_fs_info; kill_anon_super(s); kfree(c); } and void kill_anon_super(struct super_block *sb) { dev_t dev = sb->s_dev; generic_shutdown_super(sb); kill_super_notify(sb); free_anon_bdev(dev); } That removes the superblock from the list of instances before its ->s_fs_info is freed. In your branch removal happens here: if (fs->shutdown_sb) fs->shutdown_sb(s); generic_shutdown_super(s); if (fs->free_sb) fs->free_sb(s); kill_super_notify(s); That comes *after* ubifs_free_sb() has freed ->s_fs_info. And there's nothing to stop ubifs_mount() (on a completely unrelated device) to get called right at that moment. Doing the sget() call quoted above. Now, in sget() we have hlist_for_each_entry(old, &type->fs_supers, s_instances) { if (!test(old, data)) and that will hit sb_test(old, data), with old being a superblock still in ->fs_supers, but with ->s_fs_info already freed. So in sb_test() we have c equal to old->s_fs_info and return c->vi.cdev == c1->vi.cdev; is a bloody use after free. Here we are unlikely to get fucked over - it's a plain fetch from freed object. If you look at e.g. nfs, you'll see a lot more than that - pointer chasing from freed (and possibly reused) object. The only difference is that there you have sget_fc() instead of sget() - same loop anyway. The bottom line: in the form it is posted, your series reintroduces the class of UAF that had been added by taking removal from the instances list out of generic_shutdown_super() and then papered over by adding that kill_super_notify() into kill_anon_super(). And frankly, I believe that the root cause is the insistence that list removal should happen after generic_shutdown_super(). Sure, you want the superblock to serve as bdev holder, which leads to fun with -EBUSY if mount comes while umount still hadn't closed the device. I suspect that it would make a lot more sense to introduce an intermediate state - "held, but will be released in a short while". You already have something similar, but only for the entire disk ->bd_claiming stuff. Add a new primitive (will_release_bdev()), so that attempts to claim the sucker will wait until it gets released instead of failing with -EBUSY. And do *that* before generic_shutdown_super() when unmounting something that is block-based. Allows to bring the list removal back where it used to be, no UAF at all... IMO that direction is a lot more promising.