Received: by 2002:a05:6358:9144:b0:117:f937:c515 with SMTP id r4csp758915rwr; Thu, 4 May 2023 09:11:58 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ64pspqKp8Ee3GHySAHLCv7e4ob3ZGKgsDnL55hBkzhQ4hrJU974YlHfb9QEMjTmb3unyEG X-Received: by 2002:a17:902:8c91:b0:1aa:f446:d518 with SMTP id t17-20020a1709028c9100b001aaf446d518mr4115291plo.16.1683216718281; Thu, 04 May 2023 09:11:58 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1683216718; cv=none; d=google.com; s=arc-20160816; b=BfG0+iFiQti3M/7fSvR2AZi8rF2ixaXKlDssW15c7OS4qe9XrehjmyF4TsQcw8pt3f OQnYGqL6EdUvHqbaD1Cm/6ywtlM71VkbJyd3edGzIDO4T3MfkQKi0R3EdhgUDYuzICfm dZUzurlH+Mg1P18EUkGSYXlGVlb/JGXC0Yx4a28o3ExduNbrCIcyaQo4qBoZJ4EH2qm1 EwsS5GKZgBPPm9Ep6kS1DuK7hOTU2d1gNTHk8lUiSpY2uqgY0dCBC/+oCDe+W0a+6U2p 97IwHTPbQUOc32Txiy33qtJwvQz9qf4fmb5B6y+J6vGByZUkhOjwomizf3TyCsfMTQlw MBHA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-disposition:mime-version :references:message-id:subject:cc:to:from:date:dkim-signature; bh=mO/FCTJ0lgmDzfuu+m/nGF6TmLm95y7sMUjn9Rtpsbc=; b=NzYRP7dqm6WufYuCR5mPVbjkEXnOvg7dmQELCjHkbYVUm3DydygLcI5MECR+tci02v I8138KOIbjZYVFI9Vq9LyVxwdbTTRO3pinQMwT6CLYQoEMFxh0VAupPrQDEDTzFgsneo oDBgEIt3pw2SggvBDa4lhRCUXAHE6UkSjtDqNroJShwA1ZKucQivic2ygDjPAVRFbRqW vDZloO5yohSkUk6KRKjkwDYu4BII3hI7ZV77PzsyI8qzdaD5/ZjB/90L1TEz5ozpI/1/ OsyxegCPyvpAw6FCQMWx6DNyXa8mxoKilf0fUKSDC9FnxpM8GdWUFL4euRrIqU5DUmvC rO8g== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20221208 header.b=IW8KU+ay; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id j8-20020a17090aeb0800b0024c1ddfa1bfsi4173687pjz.92.2023.05.04.09.11.45; Thu, 04 May 2023 09:11:58 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20221208 header.b=IW8KU+ay; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231351AbjEDPs1 (ORCPT + 99 others); Thu, 4 May 2023 11:48:27 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59060 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230147AbjEDPs0 (ORCPT ); Thu, 4 May 2023 11:48:26 -0400 Received: from mail-wm1-x32a.google.com (mail-wm1-x32a.google.com [IPv6:2a00:1450:4864:20::32a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4778949D4; Thu, 4 May 2023 08:48:24 -0700 (PDT) Received: by mail-wm1-x32a.google.com with SMTP id 5b1f17b1804b1-3f315735514so69122235e9.1; Thu, 04 May 2023 08:48:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1683215303; x=1685807303; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=mO/FCTJ0lgmDzfuu+m/nGF6TmLm95y7sMUjn9Rtpsbc=; b=IW8KU+aybAdA7IkSMNM+90mqkc3W0LlV/6WZVdi1Ij9oLCE08FNtokKMDzc1ncrLAL e9v6oMPVkwsk8ZnkBjP7KJeoH2xnx5GD+ac0+az7WEG38+OUdHn6LSzwQhrc31gZTbPW 7424WGNFQyfLBkeGK87MdYzTQ8RUrcD70Mz/b97RvPRNISjLmbwmFQOLhZ0m6mDLiI9w nDnwbRiTsPG/oMYSH9mh5rd8jCuf0e8GOYVb0OnMmi6iTm11fzASWgtnIc6jrhLqt1CW 2UpOnOdmDGViLDf8ZuvcbESF1+Y1+rRDV3VUABszQg1EGn6DX/IkpSdrv91deyonsNrU zRjA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1683215303; x=1685807303; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=mO/FCTJ0lgmDzfuu+m/nGF6TmLm95y7sMUjn9Rtpsbc=; b=j/qSv+UrVM+1JM6eFoo++/TOtNPiZxqKXHen0isxYuq3ELlSbioZrTPBRY7EfkSthu aaR01bmypPj+rZsgPfmRLz5J7Ab82vRho6dJHrUe+wQmfV6pLTx0B47iNKiRdXztP5me yKbvkc/dZ/XjCqQ3K2p1k8sfS0lxyLeoq5ql5w3WTfF4gwLvrKF8vfvkg1Ac3XbIsqni 9iMvFRxhIhbU652FmssaDK+1SNa8i0ZLJ+7GdQm6ABWiPr3HPDugp9kTz8aK70do7maf /NgU4sqdDB8ePrSIWG/ui9Egx7FSWIVmj6YJ3TM0s2cqod4mbsbSMonKiAvQSFaCy+mD 6A9A== X-Gm-Message-State: AC+VfDy85kBnyL331y5xWjwn0O6rbbUP1axgcH2ukMI7f4+RFxRwI0gc pRUaunnLoQn3Nq2r/wct74w= X-Received: by 2002:a5d:63cd:0:b0:306:31cb:25fb with SMTP id c13-20020a5d63cd000000b0030631cb25fbmr2559900wrw.17.1683215302436; Thu, 04 May 2023 08:48:22 -0700 (PDT) Received: from localhost (host86-156-84-164.range86-156.btcentralplus.com. [86.156.84.164]) by smtp.gmail.com with ESMTPSA id i17-20020a5d6311000000b00306ec04f060sm2162821wru.107.2023.05.04.08.48.21 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 04 May 2023 08:48:21 -0700 (PDT) Date: Thu, 4 May 2023 16:48:20 +0100 From: Lorenzo Stoakes To: David Hildenbrand Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Andrew Morton , Jason Gunthorpe , Jens Axboe , Matthew Wilcox , Dennis Dalessandro , Leon Romanovsky , Christian Benvenuti , Nelson Escobar , Bernard Metzler , Peter Zijlstra , Ingo Molnar , Arnaldo Carvalho de Melo , Mark Rutland , Alexander Shishkin , Jiri Olsa , Namhyung Kim , Ian Rogers , Adrian Hunter , Bjorn Topel , Magnus Karlsson , Maciej Fijalkowski , Jonathan Lemon , "David S . Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Christian Brauner , Richard Cochran , Alexei Starovoitov , Daniel Borkmann , Jesper Dangaard Brouer , John Fastabend , linux-fsdevel@vger.kernel.org, linux-perf-users@vger.kernel.org, netdev@vger.kernel.org, bpf@vger.kernel.org, Oleg Nesterov , Jason Gunthorpe , John Hubbard , Jan Kara , "Kirill A . Shutemov" , Pavel Begunkov , Mika Penttila , Dave Chinner , Theodore Ts'o , Peter Xu , Matthew Rosato , "Paul E . McKenney" , Christian Borntraeger Subject: Re: [PATCH v8 3/3] mm/gup: disallow FOLL_LONGTERM GUP-fast writing to file-backed mappings Message-ID: References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM, RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, May 04, 2023 at 05:04:34PM +0200, David Hildenbrand wrote: > [...] > > > +static bool folio_fast_pin_allowed(struct folio *folio, unsigned int flags) > > +{ > > + struct address_space *mapping; > > + unsigned long mapping_flags; > > + > > + /* > > + * If we aren't pinning then no problematic write can occur. A long term > > + * pin is the most egregious case so this is the one we disallow. > > + */ > > + if ((flags & (FOLL_PIN | FOLL_LONGTERM | FOLL_WRITE)) != > > + (FOLL_PIN | FOLL_LONGTERM | FOLL_WRITE)) > > + return true; > > + > > + /* The folio is pinned, so we can safely access folio fields. */ > > + > > + /* Neither of these should be possible, but check to be sure. */ > > You can easily have anon pages that are at the swapcache at this point > (especially, because this function is called before our unsharing checks), > the comment is misleading. Ack will update. > > And there is nothing wrong about pinning an anon page that's still in the > swapcache. The following folio_test_anon() check will allow them. > > The check made sense in page_mapping(), but here it's not required. Waaaaaaaaaait a second, you were saying before:- "Folios in the swap cache return the swap mapping" -- you might disallow pinning anonymous pages that are in the swap cache. I recall that there are corner cases where we can end up with an anon page that's mapped writable but still in the swap cache ... so you'd fallback to the GUP slow path (acceptable for these corner cases, I guess), however especially the comment is a bit misleading then. So are we allowing or disallowing pinning anon swap cache pages? :P I mean slow path would allow them if they are just marked anon so I'm inclined to allow them. > > I do agree regarding folio_test_slab(), though. Should we WARN in case we > would have one? > > if (WARN_ON_ONCE(folio_test_slab(folio))) > return false; > God help us if we have a slab page at this point, so agreed worth doing, it would surely have to arise from some dreadful bug/memory corruption. > > + if (unlikely(folio_test_slab(folio) || folio_test_swapcache(folio))) > > + return false; > > + > > + /* hugetlb mappings do not require dirty-tracking. */ > > + if (folio_test_hugetlb(folio)) > > + return true; > > + > > + /* > > + * GUP-fast disables IRQs. When IRQS are disabled, RCU grace periods > > + * cannot proceed, which means no actions performed under RCU can > > + * proceed either. > > + * > > + * inodes and thus their mappings are freed under RCU, which means the > > + * mapping cannot be freed beneath us and thus we can safely dereference > > + * it. > > + */ > > + lockdep_assert_irqs_disabled(); > > + > > + /* > > + * However, there may be operations which _alter_ the mapping, so ensure > > + * we read it once and only once. > > + */ > > + mapping = READ_ONCE(folio->mapping); > > + > > + /* > > + * The mapping may have been truncated, in any case we cannot determine > > + * if this mapping is safe - fall back to slow path to determine how to > > + * proceed. > > + */ > > + if (!mapping) > > + return false; > > + > > + /* Anonymous folios are fine, other non-file backed cases are not. */ > > + mapping_flags = (unsigned long)mapping & PAGE_MAPPING_FLAGS; > > + if (mapping_flags) > > + return mapping_flags == PAGE_MAPPING_ANON; > > KSM pages are also (shared) anonymous folios, and that check would fail -- > which is ok (the following unsharing checks rejects long-term pinning them), > but a bit inconstent with your comment and folio_test_anon(). > > It would be more consistent (with your comment and also the folio_test_anon > implementation) to have here: > > return mapping_flags & PAGE_MAPPING_ANON; > I explicitly excluded KSM out of fear that could be some breakage given they're wrprotect'd + expected to CoW though? But I guess you mean they'd get picked up by the unshare and so it doesn't matter + we wouldn't want to exclude an PG_anon_exclusive case? I'll make the change in any case given the unshare check! I notice the gup_huge_pgd() doesn't do an unshare but I mean, a PGD-sized huge page probably isn't going to be CoW'd :P > > + > > + /* > > + * At this point, we know the mapping is non-null and points to an > > + * address_space object. The only remaining whitelisted file system is > > + * shmem. > > + */ > > + return shmem_mapping(mapping); > > +} > > + > > In general, LGTM > > Acked-by: David Hildenbrand > Thanks! Will respin, addressing your comments and addressing the issue the kernel bot picked up with placement in the appropriate #ifdef's and send out a v9 shortly. > -- > Thanks, > > David / dhildenb >