Received: by 2002:a05:7412:d8a:b0:e2:908c:2ebd with SMTP id b10csp159342rdg; Thu, 12 Oct 2023 01:39:17 -0700 (PDT) X-Google-Smtp-Source: AGHT+IEkegnqpY2nP3OENZT0ESch7C2KtmWLT1XZ4USavirYdfZsQL0FAgavKK3d+ALBoq7Y2hi2 X-Received: by 2002:a05:6870:1708:b0:1ba:caf2:acc3 with SMTP id h8-20020a056870170800b001bacaf2acc3mr28393684oae.5.1697099957602; Thu, 12 Oct 2023 01:39:17 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1697099957; cv=none; d=google.com; s=arc-20160816; b=ltZVJWAN+XPQN0lf+NPxxgjTAZPzyXyf5NT7z8MbOb4QuvFdHvZWF8ObYs0HZAcCHL XPXvapatKfozXw5WDCptjCg8AvsiN++LgwWH2UnTsmE0vxlU3BBWl74nClV5EhNBBqf3 1eVL9UVfUI+/jhzR/6evZUOALjaCCNMWWt5+WZ6258Ih8paxWV1BrkKmfqz8jGKM/+jC DpBfEk4vfBuoHTaGIC27SnLzZfUTxOEkjF6+8Jhn+mxcGNgQe2Fc6j4e+KvLfHmeEV8b FsubRey62QU59x4T9nRj4bFS7Pie3umuglc5JYxz8bkCB6SRoMi0XHLcBG72n358wVge A2VA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-disposition:mime-version :references:message-id:subject:cc:to:from:date:dkim-signature :dkim-signature; bh=8c6j9VDLhFYc5XmunB4hBYMK+2M9al9NYLUoZ11ZYVE=; fh=d6HySRFzFIv2ViptRxvN1qKFx7PzXRcHgsaBiool4Mk=; b=LEO0Eb+StLy90AvHkpkzAi5ZZjjvSItdDyRUr65KcC61jT/dz65ulBxN7NWaHgSTIt Sd2I0a6JEn00bFj6R8dfys55mYF8nnZsw20+ie6QFv7xm8x+wIjCELGGcpxGsjyM1VAc NNTJS1Jg4SGc76CsxNw16NZ+/UKORNh9QQjUu6xE0DfM2GJ1BN9+A+kahNjFbhFUT1o5 EI5mUMKhhPFyzRXKItFv73o9UykDCAaN3lHGE7i+5bpIIT8OX0knh77M9SdElKgp53OC mIwpY0fEihVV+pu1gOu+0xri/j67oFwOk9ClD4pVOXVWByLsnCMG1BMoXx9y5PKFME00 D85g== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@suse.cz header.s=susede2_rsa header.b=Wl+Ozvuj; dkim=neutral (no key) header.i=@suse.cz; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.38 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from fry.vger.email (fry.vger.email. [23.128.96.38]) by mx.google.com with ESMTPS id x2-20020a63cc02000000b005a0737404a7si1879219pgf.258.2023.10.12.01.39.17 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 12 Oct 2023 01:39:17 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.38 as permitted sender) client-ip=23.128.96.38; Authentication-Results: mx.google.com; dkim=pass header.i=@suse.cz header.s=susede2_rsa header.b=Wl+Ozvuj; dkim=neutral (no key) header.i=@suse.cz; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.38 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by fry.vger.email (Postfix) with ESMTP id 832B080AB5AE; Thu, 12 Oct 2023 01:39:14 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at fry.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235451AbjJLIjC (ORCPT + 99 others); Thu, 12 Oct 2023 04:39:02 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46412 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235465AbjJLIjB (ORCPT ); Thu, 12 Oct 2023 04:39:01 -0400 Received: from smtp-out2.suse.de (smtp-out2.suse.de [IPv6:2001:67c:2178:6::1d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D6DF690; Thu, 12 Oct 2023 01:38:59 -0700 (PDT) Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id 8BBC91F74C; Thu, 12 Oct 2023 08:38:58 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_rsa; t=1697099938; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=8c6j9VDLhFYc5XmunB4hBYMK+2M9al9NYLUoZ11ZYVE=; b=Wl+Ozvuj8Nk9C7oDWauDDMtBBusebE03rVPeCwDvmJYrHMkxfXOBPt0zc63Nr50YITICVE Xc/wXLDUGEX9TYzBQy6WMirJ6fYNPooqR4EDMFOvfpLJGYmQ2WaSHcsYxFRTMTlI5jJtoT vpz+tCQoL9srkPNQBkiIETjg2bz2zgU= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_ed25519; t=1697099938; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=8c6j9VDLhFYc5XmunB4hBYMK+2M9al9NYLUoZ11ZYVE=; b=OqpZ1G7FQHL2iZ2KYqxZhwCiDkq/qYL7Juof0mRlQdToTCF/UL3aew1AKtlnygFIL9OYWc 8WHEHcon6A+TfqBw== Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id 7BB79139F9; Thu, 12 Oct 2023 08:38:58 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id sckdHqKwJ2XPUgAAMHmgww (envelope-from ); Thu, 12 Oct 2023 08:38:58 +0000 Received: by quack3.suse.cz (Postfix, from userid 1000) id E8078A06B0; Thu, 12 Oct 2023 10:38:57 +0200 (CEST) Date: Thu, 12 Oct 2023 10:38:57 +0200 From: Jan Kara To: Lorenzo Stoakes Cc: Jan Kara , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Andrew Morton , Mike Kravetz , Muchun Song , Alexander Viro , Christian Brauner , Matthew Wilcox , Hugh Dickins , Andy Lutomirski , linux-fsdevel@vger.kernel.org, bpf@vger.kernel.org Subject: Re: [PATCH v3 3/3] mm: enforce the mapping_map_writable() check after call_mmap() Message-ID: <20231012083857.ty66retpyhxkaem3@quack3> References: <20231011094627.3xohlpe4gm2idszm@quack3> <512d8089-759c-47b7-864d-f4a38a9eacf3@lucifer.local> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <512d8089-759c-47b7-864d-f4a38a9eacf3@lucifer.local> X-Spam-Status: No, score=-0.8 required=5.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI, SPF_HELO_NONE,SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on fry.vger.email Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (fry.vger.email [0.0.0.0]); Thu, 12 Oct 2023 01:39:14 -0700 (PDT) On Wed 11-10-23 19:14:10, Lorenzo Stoakes wrote: > On Wed, Oct 11, 2023 at 11:46:27AM +0200, Jan Kara wrote: > > On Sat 07-10-23 21:51:01, Lorenzo Stoakes wrote: > > > In order for an F_SEAL_WRITE sealed memfd mapping to have an opportunity to > > > clear VM_MAYWRITE in seal_check_write() we must be able to invoke either > > > the shmem_mmap() or hugetlbfs_file_mmap() f_ops->mmap() handler to do so. > > > > > > We would otherwise fail the mapping_map_writable() check before we had > > > the opportunity to clear VM_MAYWRITE. > > > > > > However, the existing logic in mmap_region() performs this check BEFORE > > > calling call_mmap() (which invokes file->f_ops->mmap()). We must enforce > > > this check AFTER the function call. > > > > > > In order to avoid any risk of breaking call_mmap() handlers which assume > > > this will have been done first, we continue to mark the file writable > > > first, simply deferring enforcement of it failing until afterwards. > > > > > > This enables mmap(..., PROT_READ, MAP_SHARED, fd, 0) mappings for memfd's > > > sealed via F_SEAL_WRITE to succeed, whereas previously they were not > > > permitted. > > > > > > Link: https://bugzilla.kernel.org/show_bug.cgi?id=217238 > > > Signed-off-by: Lorenzo Stoakes > > > > ... > > > > > diff --git a/mm/mmap.c b/mm/mmap.c > > > index 6f6856b3267a..9fbee92aaaee 100644 > > > --- a/mm/mmap.c > > > +++ b/mm/mmap.c > > > @@ -2767,17 +2767,25 @@ unsigned long mmap_region(struct file *file, unsigned long addr, > > > vma->vm_pgoff = pgoff; > > > > > > if (file) { > > > - if (is_shared_maywrite(vm_flags)) { > > > - error = mapping_map_writable(file->f_mapping); > > > - if (error) > > > - goto free_vma; > > > - } > > > + int writable_error = 0; > > > + > > > + if (vma_is_shared_maywrite(vma)) > > > + writable_error = mapping_map_writable(file->f_mapping); > > > > > > vma->vm_file = get_file(file); > > > error = call_mmap(file, vma); > > > if (error) > > > goto unmap_and_free_vma; > > > > > > + /* > > > + * call_mmap() may have changed VMA flags, so retry this check > > > + * if it failed before. > > > + */ > > > + if (writable_error && vma_is_shared_maywrite(vma)) { > > > + error = writable_error; > > > + goto close_and_free_vma; > > > + } > > > > Hum, this doesn't quite give me a peace of mind ;). One bug I can see is > > that if call_mmap() drops the VM_MAYWRITE flag, we seem to forget to drop > > i_mmap_writeable counter here? > > This wouldn't be applicable in the F_SEAL_WRITE case, as the > i_mmap_writable counter would already have been decremented, and thus an > error would arise causing no further decrement, and everything would work > fine. > > It'd be very odd for something to be writable here but the driver to make > it not writable. But we do need to account for this. Yeah, it may be odd but this is indeed what i915 driver appears to be doing in i915_gem_object_mmap(): if (i915_gem_object_is_readonly(obj)) { if (vma->vm_flags & VM_WRITE) { i915_gem_object_put(obj); return -EINVAL; } vm_flags_clear(vma, VM_MAYWRITE); } > > I've checked why your v2 version broke i915 and I think the reason maybe > > has nothing to do with i915. Just in case call_mmap() failed, it ended up > > jumping to unmap_and_free_vma which calls mapping_unmap_writable() but we > > didn't call mapping_map_writable() yet so the counter became imbalanced. > > yeah that must be the cause, I thought perhaps somehow > __remove_shared_vm_struct() got invoked by i915_gem_mmap() but I didn't > trace it through to see if it was possible. > > Looking at it again, i don't think that is possible, as we hold a mmap/vma > write lock, and the only operations that can cause > __remove_shared_vm_struct() to run are things that would not be able to do > so with this lock held. > > > So I'd be for returning to v2 version, just fix up the error handling > > paths... > > So in conclusion, I agree, this is the better approach. Will respin in v4. Thanks! Honza -- Jan Kara SUSE Labs, CR