Date: Wed, 23 Jan 2019 18:04:47 -0500
From: Jerome Glisse
To: Dan Williams
Cc: Ralph Campbell, Jan Kara, Arnd Bergmann, KVM list, Matthew Wilcox,
    linux-rdma, John Hubbard, Felix Kuehling, Radim Krčmář,
    Linux Kernel Mailing List, Mailing list - DRI developers, Michal Hocko,
    Linux MM, Jason Gunthorpe, Ross Zwisler, linux-fsdevel, Paolo Bonzini,
    Andrew Morton, Christian König
Subject: Re: [PATCH v4 0/9] mmu notifier provide context informations
Message-ID: <20190123230447.GC1257@redhat.com>
References: <20190123222315.1122-1-jglisse@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Jan 23, 2019 at 02:54:40PM -0800, Dan Williams wrote:
> On Wed, Jan 23, 2019 at 2:23 PM wrote:
> >
> > From: Jérôme Glisse
> >
> > Hi Andrew, I see that you still have my event patch in your queue [1].
> > This patchset replaces that single patch and is broken down into further
> > steps so that it is easier to review and ascertain that no mistakes were
> > made during mechanical changes. Here are the steps:
> >
> > Patch 1 - add the enum values
> > Patch 2 - coccinelle semantic patch to convert all call sites of
> >           mmu_notifier_range_init to the default enum value and also
> >           to pass down the vma when it is available
> > Patch 3 - update many call sites to more accurate enum values
> > Patch 4 - add the information to the mmu_notifier_range struct
> > Patch 5 - helper to test if a range is updated to read only
> >
> > All the remaining patches are updates to various drivers to demonstrate
> > how this new information gets used by device drivers. I build tested
> > with make all and make all minus everything that enables mmu notifier,
> > i.e. building with MMU_NOTIFIER=no. Also tested with some radeon, amd
> > gpu and intel gpu.
> >
> > If there are no objections I believe the best plan would be to merge
> > the first 5 patches (all mm changes) through your queue for 5.1 and
> > then to delay the driver updates to each individual driver tree for 5.2.
> > This will allow each individual device driver maintainer time to test
> > this more thoroughly than my own testing.
> >
> > Note that I also intend to use this feature further in nouveau and
> > HMM down the road. I also expect that other users like KVM might be
> > interested in leveraging this new information to optimize some of
> > their secondary page table invalidation.
>
> "Down the road" users should introduce the functionality they want to
> consume. The common concern with preemptively including
> forward-looking infrastructure is realizing later that the
> infrastructure is not needed, or needs changing. If it has no current
> consumer, leave it out.

This patchset already shows that this is useful, what more can I do?
I know I will use this information: in nouveau, for memory policy, we
allocate our own structure for every vma the GPU ever accessed or that
userspace hinted we should set a policy for. Right now, with the existing
mmu notifier, I _must_ free those structures because I do not know if the
invalidation is a munmap or something else. So I am losing important
information and unnecessarily freeing structures that I will have to
re-allocate just a couple of jiffies later. That's one way I am using this.
The other way is to optimize GPU page table updates, just like I am doing
with all the patches to RDMA/ODP and various GPU drivers.

> > Here is an explanation of the rationale for this patchset:
> >
> >
> > CPU page table updates can happen for many reasons, not only as a result
> > of a syscall (munmap(), mprotect(), mremap(), madvise(), ...) but also
> > as a result of kernel activities (memory compression, reclaim, migration,
> > ...).
> >
> > This patch introduces a set of enums that can be associated with each of
> > the events triggering a mmu notifier. Later patches take advantage of
> > those enum values.
> >
> > - UNMAP: munmap() or mremap()
> > - CLEAR: page table is cleared (migration, compaction, reclaim, ...)
> > - PROTECTION_VMA: change in access protections for the range
> > - PROTECTION_PAGE: change in access protections for page in the range
> > - SOFT_DIRTY: soft dirtiness tracking
> >
> > Being able to distinguish munmap() and mremap() from other reasons why the
> > page table is cleared is important to allow users of mmu notifier to
> > update their own internal tracking structures accordingly (on munmap or
> > mremap it is no longer necessary to track the range of virtual addresses,
> > as it becomes invalid).
>
> The only context information consumed in this patch set is
> MMU_NOTIFY_PROTECTION_VMA.
>
> What is the practical benefit of these "optimize out the case when a
> range is updated to read only" optimizations? Any numbers to show this
> is worth the code thrash?

It depends on the workload. For instance, if you map a file to RDMA read
only, like a log file for export, all the write back that would otherwise
disrupt the RDMA mapping can be optimized out.

See above for more reasons why it is beneficial (knowing when it is
a munmap/mremap versus something else).

I would not have thought that passing down information was something
that controversial. Hope this helps you see the benefit of this.

Cheers,
Jérôme
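
To make the two uses discussed above concrete, here is a minimal user-space
sketch in C. It is not code from the patchset. The enum mirrors the event
list in the cover letter, but every identifier below (the DEMO_ names,
struct demo_range, struct demo_mirror, demo_invalidate, and the
vma_writable_after field) is hypothetical and stands in for whatever the
kernel structures and a real driver actually use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Events from the cover letter's list; DEMO_ names are hypothetical. */
enum demo_event {
	DEMO_UNMAP,           /* munmap() or mremap() */
	DEMO_CLEAR,           /* migration, compaction, reclaim, ... */
	DEMO_PROTECTION_VMA,  /* access protection change for the range */
	DEMO_PROTECTION_PAGE, /* access protection change for a page */
	DEMO_SOFT_DIRTY,      /* soft dirtiness tracking */
};

/* Hypothetical stand-in for the range passed to a notifier callback. */
struct demo_range {
	unsigned long start;
	unsigned long end;
	enum demo_event event;
	bool vma_writable_after; /* assumed: protection after the update */
};

/* Hypothetical per-range state a device driver might keep. */
struct demo_mirror {
	bool policy_allocated;      /* driver's own tracking structure */
	bool device_ptes_valid;     /* device page table entries present */
	bool device_maps_read_only; /* device only ever reads the range */
};

/* Assumed analogue of "helper to test if a range is updated to read only". */
bool demo_update_to_read_only(const struct demo_range *r)
{
	return r->event == DEMO_PROTECTION_VMA && !r->vma_writable_after;
}

/* Sketch of an invalidation callback that uses the event information. */
void demo_invalidate(struct demo_mirror *m, const struct demo_range *r)
{
	assert(m != NULL && r != NULL);

	if (r->event == DEMO_UNMAP) {
		/* The virtual range is gone: tracking can be freed too. */
		m->policy_allocated = false;
		m->device_ptes_valid = false;
		return;
	}

	/* A read-only device mapping survives an update to read only. */
	if (m->device_maps_read_only && demo_update_to_read_only(r))
		return;

	/* Anything else: drop device PTEs but keep the tracking structure. */
	m->device_ptes_valid = false;
}
```

Under these assumptions, only DEMO_UNMAP frees the tracking structure, so
a reclaim or migration no longer forces a re-allocation shortly afterwards,
and a read-only device mapping is left untouched when the range is merely
downgraded to read only.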