Date: Thu, 28 Sep 2023 14:34:19 -0400
From: Peter Xu
To: David Hildenbrand
Cc: Jann Horn, Suren Baghdasaryan, akpm@linux-foundation.org,
    viro@zeniv.linux.org.uk, brauner@kernel.org, shuah@kernel.org,
    aarcange@redhat.com, lokeshgidra@google.com, hughd@google.com,
    mhocko@suse.com, axelrasmussen@google.com, rppt@kernel.org,
    willy@infradead.org, Liam.Howlett@oracle.com, zhangpeng362@huawei.com,
    bgeffon@google.com, kaleshsingh@google.com, ngeoffray@google.com,
    jdduke@google.com, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
    linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org,
    kernel-team@android.com
Subject: Re: [PATCH v2 2/3] userfaultfd: UFFDIO_REMAP uABI
References: <20230923013148.1390521-1-surenb@google.com>
    <20230923013148.1390521-3-surenb@google.com>
    <03f95e90-82bd-6ee2-7c0d-d4dc5d3e15ee@redhat.com>
    <98b21e78-a90d-8b54-3659-e9b890be094f@redhat.com>
    <85e5390c-660c-ef9e-b415-00ee71bc5cbf@redhat.com>
In-Reply-To: <85e5390c-660c-ef9e-b415-00ee71bc5cbf@redhat.com>

On Thu, Sep 28, 2023 at 07:51:18PM +0200, David Hildenbrand wrote:
> On 28.09.23 19:21, Peter Xu wrote:
> > On Thu, Sep 28, 2023 at 07:05:40PM +0200, David Hildenbrand wrote:
> > > As described as reply to v1, without fork() and KSM, the PAE bit should
> > > stick around. If that's not the case, we should investigate why.
> > >
> > > If we ever support the post-fork case (which the comment above remap_pages()
> > > excludes) we'll need good motivation why we'd want to make this
> > > overly-complicated feature even more complicated.
> >
> > The problem is DONTFORK is only a suggestion, but not yet restricted. If
> > someone reaches on top of some !PAE page on src it'll never gonna proceed
> > and keep failing, iiuc.
>
> Yes. It won't work if you fork() and not use DONTFORK on the src VMA. We
> should document that as a limitation.
>
> For example, you could return an error to the user that can just call
> UFFDIO_COPY. (or to the UFFDIO_COPY from inside uffd code, but that's
> probably ugly as well).

We could indeed return a special errno when the PAE check fails, then
document it explicitly in the man page and suggest resolutions (like
DONTFORK) for when a user hits it.

> >
> > do_wp_page() doesn't have that issue of accuracy only because one round of
> > CoW will just allocate a new page with PAE set guaranteed, which is pretty
> > much self-heal and unnoticed.
>
> Yes. But it might have to copy, at which point the whole optimization of
> remap is gone :)

Right, but that's fine IMHO because it should still be very much a corner
case, and definitely not common enough to start impacting the performance
results.

> >
> > So it'll be great if we can have similar self-heal way for PAE. If not, I
> > think it's still fine we just always fail on !PAE src pages, but then maybe
> > we should let the user know what's wrong, e.g., the user can just forgot to
> > apply DONTFORK then forked. And then the user hits error and don't know
> > what happened. Probably at least we should document it well in man pages.
> >
> Yes, exactly.
>
> >
> > Another option can be we keep using folio_mapcount() for pte, and another
> > helper (perhaps: _nr_pages_mapped==COMPOUND_MAPPED && _entire_mapcount==1)
> > for thp. But I know that's not ideal either.
>
> As long as we only set the pte writable if PAE is set, we're good from a CVE
> perspective. The other part is just simplicity of avoiding all these
> mapcount+swapcount games where possible.
>
> (one day folio_mapcount() might be faster -- I'm still working on that patch
> in the bigger picture of handling PTE-mapped THP better)

Sure.

For now, as long as we're crystal clear that PAE can be inaccurate, that the
inaccuracy only shows up with fork() && !DONTFORK, and that we document it
properly, this sounds good to me.

Thanks,

--
Peter Xu
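
The "fail on !PAE, let the caller fall back to a copy" flow discussed above can be sketched as a small userspace toy model. This is not kernel code and not the patch's implementation: struct toy_page, toy_remap(), toy_copy(), toy_resolve() and the choice of EBUSY as the special errno are all hypothetical names picked for illustration, since the thread does not settle on an errno.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_PAGE_SIZE 4096

/* Toy stand-in for an anonymous page; anon_exclusive models the PAE bit. */
struct toy_page {
	bool anon_exclusive;
	unsigned char data[TOY_PAGE_SIZE];
};

/* Hypothetical errno for "src page is !PAE"; the thread does not pick one. */
#define TOY_ENOTEXCL EBUSY

/* Models the remap path: move the page only if it is exclusive. */
static int toy_remap(struct toy_page **dst, struct toy_page **src)
{
	assert(dst && src && *src);
	if (!(*src)->anon_exclusive)
		return -TOY_ENOTEXCL;	/* e.g. fork() without DONTFORK */
	*dst = *src;			/* zero-copy: the page changes owner */
	*src = NULL;
	return 0;
}

/* Models the copy path: always possible, at the cost of a full copy. */
static void toy_copy(struct toy_page **dst, const struct toy_page *src,
		     struct toy_page *fresh)
{
	memcpy(fresh->data, src->data, TOY_PAGE_SIZE);
	fresh->anon_exclusive = true;	/* a freshly allocated page is exclusive */
	*dst = fresh;
}

/*
 * Returns 0 if the page was moved, 1 if it was copied instead,
 * or a negative errno for any other failure.
 */
static int toy_resolve(struct toy_page **dst, struct toy_page **src,
		       struct toy_page *fresh)
{
	int ret = toy_remap(dst, src);

	if (ret != -TOY_ENOTEXCL)
		return ret;
	toy_copy(dst, *src, fresh);
	return 1;
}
```

In this model the copy fallback is only taken for the !PAE corner case, which matches the expectation in the thread that it should be too rare to affect performance results.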