Date: Mon, 26 Oct 2020 11:16:01 -0300
From: Jason Gunthorpe
To: Matthew Wilcox
Cc: John Hubbard, "Kirill A. Shutemov", Dave Hansen, Andy Lutomirski,
    Peter Zijlstra, Paolo Bonzini, Sean Christopherson, Vitaly Kuznetsov,
    Wanpeng Li, Jim Mattson, Joerg Roedel, David Rientjes,
    Andrea Arcangeli, Kees Cook, Will Drewry, "Edgecombe, Rick P",
    "Kleen, Andi", Liran Alon, Mike Rapoport, x86@kernel.org,
    kvm@vger.kernel.org, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org, "Kirill A.
    Shutemov"
Subject: Re: [RFCv2 08/16] KVM: Use GUP instead of copy_from/to_user() to access guest memory
Message-ID: <20201026141601.GT36674@ziepe.ca>
References: <20201020061859.18385-1-kirill.shutemov@linux.intel.com>
    <20201020061859.18385-9-kirill.shutemov@linux.intel.com>
    <20201022114946.GR20115@casper.infradead.org>
    <30ce6691-fd70-76a2-8b61-86d207c88713@nvidia.com>
    <20201026042158.GN20115@casper.infradead.org>
    <20201026132830.GQ20115@casper.infradead.org>
In-Reply-To: <20201026132830.GQ20115@casper.infradead.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Oct 26, 2020 at 01:28:30PM +0000, Matthew Wilcox wrote:
> > > It's been five years since DAX was merged, and page pinning still
> > > doesn't work. How much longer before the people who are pushing it
> > > realise that it's fundamentally flawed?
> >
> > Is this a separate rant about *only* DAX, or is general RDMA in your sights
> > too? :)
>
> This is a case where it's not RDMA's _fault_ that there's no good API
> for it to do what it needs to do. There's a lot of work needed to wean
> Linux device drivers off their assumption that there's a struct page
> for every byte of memory.

People who care seem to have just given up and are using RDMA ODP, so
I'm not optimistic this DAX issue will ever be solved. I've also
removed almost all the struct page references from this flow in RDMA,
so if there is some way that helps, it is certainly doable.

Regardless of DAX, the pinning indication is now being used during
fork() for some good reasons, and seems to make sense in other use
cases. It just doesn't seem like a way to solve the DAX issue.

More or less it seems to mean that pinned pages cannot be write
protected, and more broadly that the kernel should not change the PTEs
for those pages independently of the application. i.e. the more
aggressive COW on fork() caused data corruption regressions...

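As a rough illustration of the fork() point only: the following is a
user-space toy model, not kernel code. Every name in it (toy_page,
toy_pte, toy_fork_one_pte) is made up for this sketch, and it assumes
the simplest possible rule: an unpinned page is shared and write
protected on both sides, while a pinned page is copied for the child
immediately so the parent's PTE is never touched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a page; not the kernel's struct page. */
struct toy_page {
	int data;
	bool dma_pinned;	/* set while a device may still access the page */
};

/* Hypothetical stand-in for a single page table entry. */
struct toy_pte {
	struct toy_page *page;
	bool writable;
};

/*
 * Copy one PTE from parent to child at fork() time.
 *
 * Unpinned page: share it and write protect both PTEs (lazy COW).
 * Pinned page:   copy it into 'spare' for the child right away and leave
 *                the parent's PTE as it was, so the parent and the device
 *                keep working on the same physical page.
 */
static void toy_fork_one_pte(struct toy_pte *parent, struct toy_pte *child,
			     struct toy_page *spare)
{
	assert(parent != NULL && child != NULL && spare != NULL);

	if (parent->page->dma_pinned) {
		*spare = *parent->page;		/* eager copy for the child */
		spare->dma_pinned = false;	/* the copy itself is not pinned */
		child->page = spare;
		child->writable = true;
		return;
	}

	parent->writable = false;		/* both sides fault on write */
	child->page = parent->page;
	child->writable = false;
}
```

Calling toy_fork_one_pte() on a pinned page leaves the parent's entry
writable and gives the child its own copy; on an unpinned page both
entries end up read-only and point at the same page.
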
I wonder if the point here is that some page owners can't/won't
support DMA pinning, and pinning should just be blocked completely for
them. I'd much rather have write access pin_user_pages() just fail
than oops the kernel on ext4-owned VMAs, for instance.

Jason
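
To make the "just fail" idea concrete, here is a user-space toy sketch,
again not kernel code. The supports_dma_pin field, the TOY_PIN_WRITE
flag and toy_pin_pages() are all hypothetical names invented for this
example, and the choice of -EOPNOTSUPP as the error is an assumption,
not something proposed in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag: the caller wants to write through the pin. */
#define TOY_PIN_WRITE 0x1u

/* Hypothetical, simplified stand-in for a VMA and its page owner. */
struct toy_vma {
	const char *owner;	/* e.g. "anon" or "ext4" */
	bool supports_dma_pin;	/* made-up capability advertised by the owner */
};

/*
 * Refuse a write pin up front when the page owner cannot support it,
 * instead of letting the pin succeed and failing much later.
 * Returns 0 on success or a negative errno value on refusal.
 */
static int toy_pin_pages(const struct toy_vma *vma, unsigned int flags)
{
	assert(vma != NULL);

	if ((flags & TOY_PIN_WRITE) && !vma->supports_dma_pin)
		return -EOPNOTSUPP;

	return 0;
}
```

With this rule a write pin on a VMA whose owner does not advertise
support is rejected immediately, while read-only pins and pins on
supporting owners still go through.
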