Received: by 2002:a05:6358:9144:b0:117:f937:c515 with SMTP id r4csp7148413rwr; Tue, 2 May 2023 10:05:43 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ6RSjeHF0oRa1x2jwxWJIto/jJAu2LJvnMoZ1qbFRyPmQg/WzP35e/7yoljDDUKu1FC6b1E X-Received: by 2002:a05:6808:488:b0:38e:d6a4:69c0 with SMTP id z8-20020a056808048800b0038ed6a469c0mr9169627oid.13.1683047142873; Tue, 02 May 2023 10:05:42 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1683047142; cv=none; d=google.com; s=arc-20160816; b=vRElgra43rgQ8wdvnKISKvF4HsIV9va7Dm181Z48FDElIKnTq0hlkkXems3ea657mQ s4XCf3E3Tq+14hq/+jVpHMHVUc/E6SVUWcgs0yzr1Ux7YkiEBPmf+5awQyGrWCxR0bTV oF7HHMfUMr3+57pUhtmy94pmNiFyB+tgP6rwjtUjZpmcd41Jd72L5HR8K68n4UtA1TtI F3obLfF66aABNt0xOgz0OBbFncLlr92Qk5Uk3sCito/kEwkw+t/b/V9ARkDC8s0qG4zF DO66dY4oKE9M1O0lSDprOCio0UQh4FVN5gE+p+2LDlDD0ZKCGkVK4syXmWoPyVW01DnR R6pQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-transfer-encoding :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:dkim-signature; bh=H/XhiryUevVm3Cw0bRPhAFmwGi+VKwNTD6d8PLLNIJU=; b=iNs3fVwVNL1+8J5IpVP0XQFGCa4Y20CR1Sb4B4y6m8tzdYytU3vuGkM0CudussKSza HiEtjq/T3fyge6XagGy5A0Mo3SlvfWw33VxnTKHo7Df+fLbbL+BLrSlyCOxCrldJUNYQ PSzej6b8uREH0jNzf4bk4ByOny857v/M3g2x4gK3/obbb+YtHoTBjktMczT6aQY0Ta5x ayTOQFNSc5oNcykBOKLu+UoFTIb9gL9rSRRhy89Hz3gZLWGl1fcq8emJs4sKTtVBLyrN xCbCxfYIudODMo8MChazQLihNJgrK+NYQI99hSTqosj695XEwK1AqB+sT3tgup3bQGQ9 07fw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20221208 header.b=CRk7F9cD; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id u15-20020a056808114f00b00384570b7fc1si23848661oiu.191.2023.05.02.10.05.27; Tue, 02 May 2023 10:05:42 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20221208 header.b=CRk7F9cD; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234355AbjEBQxy (ORCPT + 99 others); Tue, 2 May 2023 12:53:54 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39976 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234352AbjEBQxv (ORCPT ); Tue, 2 May 2023 12:53:51 -0400 Received: from mail-wm1-x32d.google.com (mail-wm1-x32d.google.com [IPv6:2a00:1450:4864:20::32d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 272301FFF; Tue, 2 May 2023 09:53:50 -0700 (PDT) Received: by mail-wm1-x32d.google.com with SMTP id 5b1f17b1804b1-3f315712406so178263125e9.0; Tue, 02 May 2023 09:53:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1683046428; x=1685638428; h=in-reply-to:content-transfer-encoding:content-disposition :mime-version:references:message-id:subject:cc:to:from:date:from:to :cc:subject:date:message-id:reply-to; bh=H/XhiryUevVm3Cw0bRPhAFmwGi+VKwNTD6d8PLLNIJU=; b=CRk7F9cD3K3ahmGmI3nRMbzR6bOv/GR4Lw1i0CqadZ2gfc3QYRy4U3WMKo6dHYddaw F8xPCOiOdqT1rEQQp2FPHuCb88EFVUEVNa8YxWeRDgtLuZVuu6DU2AbkCFYYFncKM+62 trHryjBobn6ZkM3VnwGlqM+tffzT49q3t5FeZzlJiTKDpiRNqHydcmR3m9geS9mWPfXw /K2t8KAtmSDybpVkB2mi0K5H3UYQ7xfBc8AwQKQrZdSEVdOVt2IVg2+1qpZKMHomAsTq sR2FWuit4tnLak2PzqyaZV8Db616SiuaVBe7DjzxNVnSF2Pu502pJOHmUJyi2WtvtsZV c3oA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1683046428; x=1685638428; h=in-reply-to:content-transfer-encoding:content-disposition :mime-version:references:message-id:subject:cc:to:from:date :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=H/XhiryUevVm3Cw0bRPhAFmwGi+VKwNTD6d8PLLNIJU=; b=dBQ4eCNXU3l+Yse88A6ZTaHirCCDRvAA5+b5u4/fivQSTKnIL3IrC9CCDg8OY3IIJ7 oCp9bu93z22/PFAT9wmcq8IeJnEUxSUrC9Q1MltVW+htWE+fLst+oHyyWF5E+gYZNMxs FMdt1DSSaNq77ULmkSYYnjDw+tnippRpoIgtDpFh7sWvRYjiKBgaYAwuNX8PRbCGozIp TOvlvAW+kr0MsAxq4nxKG2IW617qaDymDdHVYeeK4HHrI6LJAhKy/SfXJcrGwFgLCGVs GxKiOpuvSlbQVPE5a1LNKYLtHbICcYIPT91utMIb/0zd+sjLF/d14AGQMcF9coSclNT2 WJSw== X-Gm-Message-State: AC+VfDz2zzqfA1/OCwdOjP/dNdNRh5wLnACJA3IuJeKkFjw6enxxiY4Z v/l/CSr4hGju/r76dVmmt1U= X-Received: by 2002:a05:600c:82c5:b0:3f1:76d9:c788 with SMTP id eo5-20020a05600c82c500b003f176d9c788mr16116146wmb.8.1683046428364; Tue, 02 May 2023 09:53:48 -0700 (PDT) Received: from localhost (host86-156-84-164.range86-156.btcentralplus.com. [86.156.84.164]) by smtp.gmail.com with ESMTPSA id g10-20020a7bc4ca000000b003f171234a08sm35756623wmk.20.2023.05.02.09.53.47 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 02 May 2023 09:53:47 -0700 (PDT) Date: Tue, 2 May 2023 17:53:46 +0100 From: Lorenzo Stoakes To: David Hildenbrand Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Andrew Morton , Jason Gunthorpe , Jens Axboe , Matthew Wilcox , Dennis Dalessandro , Leon Romanovsky , Christian Benvenuti , Nelson Escobar , Bernard Metzler , Peter Zijlstra , Ingo Molnar , Arnaldo Carvalho de Melo , Mark Rutland , Alexander Shishkin , Jiri Olsa , Namhyung Kim , Ian Rogers , Adrian Hunter , Bjorn Topel , Magnus Karlsson , Maciej Fijalkowski , Jonathan Lemon , "David S . Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Christian Brauner , Richard Cochran , Alexei Starovoitov , Daniel Borkmann , Jesper Dangaard Brouer , John Fastabend , linux-fsdevel@vger.kernel.org, linux-perf-users@vger.kernel.org, netdev@vger.kernel.org, bpf@vger.kernel.org, Oleg Nesterov , Jason Gunthorpe , John Hubbard , Jan Kara , "Kirill A . Shutemov" , Pavel Begunkov , Mika Penttila , Dave Chinner , Theodore Ts'o , Peter Xu , Matthew Rosato , "Paul E . McKenney" , Christian Borntraeger Subject: Re: [PATCH v7 1/3] mm/mmap: separate writenotify and dirty tracking logic Message-ID: References: <72a90af5a9e4445a33ae44efa710f112c2694cb1.1683044162.git.lstoakes@gmail.com> <56696a72-24fa-958e-e6a1-7a17c9e54081@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <56696a72-24fa-958e-e6a1-7a17c9e54081@redhat.com> X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM, RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, May 02, 2023 at 06:38:53PM +0200, David Hildenbrand wrote: > On 02.05.23 18:34, Lorenzo Stoakes wrote: > > vma_wants_writenotify() is specifically intended for setting PTE page table > > flags, accounting for existing PTE flag state and whether that might > > already be read-only while mixing this check with a check whether the > > filesystem performs dirty tracking. > > > > Separate out the notions of dirty tracking and a PTE write notify checking > > in order that we can invoke the dirty tracking check from elsewhere. > > > > Note that this change introduces a very small duplicate check of the > > separated out vm_ops_needs_writenotify(). This is necessary to avoid making > > vma_needs_dirty_tracking() needlessly complicated (e.g. passing a > > check_writenotify flag or having it assume this check was already > > performed). This is such a small check that it doesn't seem too egregious > > to do this. > > > > Signed-off-by: Lorenzo Stoakes > > Reviewed-by: John Hubbard > > Reviewed-by: Mika Penttil? > > Reviewed-by: Jan Kara > > Reviewed-by: Jason Gunthorpe > > --- > > include/linux/mm.h | 1 + > > mm/mmap.c | 36 +++++++++++++++++++++++++++--------- > > 2 files changed, 28 insertions(+), 9 deletions(-) > > > > diff --git a/include/linux/mm.h b/include/linux/mm.h > > index 27ce77080c79..7b1d4e7393ef 100644 > > --- a/include/linux/mm.h > > +++ b/include/linux/mm.h > > @@ -2422,6 +2422,7 @@ extern unsigned long move_page_tables(struct vm_area_struct *vma, > > #define MM_CP_UFFD_WP_ALL (MM_CP_UFFD_WP | \ > > MM_CP_UFFD_WP_RESOLVE) > > +bool vma_needs_dirty_tracking(struct vm_area_struct *vma); > > int vma_wants_writenotify(struct vm_area_struct *vma, pgprot_t vm_page_prot); > > static inline bool vma_wants_manual_pte_write_upgrade(struct vm_area_struct *vma) > > { > > diff --git a/mm/mmap.c b/mm/mmap.c > > index 5522130ae606..295c5f2e9bd9 100644 > > --- a/mm/mmap.c > > +++ b/mm/mmap.c > > @@ -1475,6 +1475,31 @@ SYSCALL_DEFINE1(old_mmap, struct mmap_arg_struct __user *, arg) > > } > > #endif /* __ARCH_WANT_SYS_OLD_MMAP */ > > +/* Do VMA operations imply write notify is required? */ > > +static bool vm_ops_needs_writenotify(const struct vm_operations_struct *vm_ops) > > +{ > > + return vm_ops && (vm_ops->page_mkwrite || vm_ops->pfn_mkwrite); > > +} > > + > > +/* > > + * Does this VMA require the underlying folios to have their dirty state > > + * tracked? > > + */ > > +bool vma_needs_dirty_tracking(struct vm_area_struct *vma) > > +{ > > Sorry for not noticing this earlier, but ... pints_owed++ > > what about MAP_PRIVATE mappings? When we write, we populate an anon page, > which will work as expected ... because we don't have to notify the fs? > > I think you really also want the "If it was private or non-writable, the > write bit is already clear */" part as well and remove "false" in that case. > Not sure a 'write bit is already clear' case is relevant to checking whether a filesystem dirty tracks? That seems specific entirely to the page table bits. That's why I didn't include it, A !VM_WRITE shouldn't be GUP-writable except for FOLL_FORCE, and that surely could be problematic if VM_MAYWRITE later? Thinking about it though a !VM_SHARE should probably can be safely assumed to not be dirty-trackable, so we probably do need to add a check for !VM_SHARED -> !vma_needs_dirty_tracking > Or was there a good reason to disallow private mappings as well? > Until the page is CoW'd walking the page tables will get you to the page cache page right? This was the reason I (perhaps rather too quickly) felt MAP_PRIVATE should be excluded. However a FOLL_WRITE would trigger CoW... and then we'd be trivially OK. So yeah, ok perhaps I dismissed that a little too soon. I was concerned about some sort of egregious FOLL_FORCE case where somehow we'd end up with the page cache folio. But actually, that probably can't happen... > -- > Thanks, > > David / dhildenb >