Date: Wed, 30 Aug 2017 22:55:38 +0200
From: Andrea Arcangeli <aarcange@redhat.com>
To: Nadav Amit
Cc: Jerome Glisse, Linux Kernel Mailing List,
    "open list:MEMORY MANAGEMENT", Dan Williams, Ross Zwisler,
    Linus Torvalds, Bernhard Held, Adam Borowski, Radim Krčmář,
    Wanpeng Li, Paolo Bonzini, Takashi Iwai, Mike Galbraith,
    "Kirill A. Shutemov", axie, Andrew Morton
Subject: Re: [PATCH 02/13] mm/rmap: update to new mmu_notifier semantic
Message-ID: <20170830205538.GH13559@redhat.com>
In-Reply-To: <180A2625-E3AB-44BF-A3B7-E687299B9DA9@gmail.com>
References: <20170829235447.10050-1-jglisse@redhat.com>
 <20170829235447.10050-3-jglisse@redhat.com>
 <6D58FBE4-5D03-49CC-AAFF-3C1279A5A849@gmail.com>
 <20170830172747.GE13559@redhat.com>
 <20170830182013.GD2386@redhat.com>
 <180A2625-E3AB-44BF-A3B7-E687299B9DA9@gmail.com>

On Wed, Aug 30, 2017 at 11:40:08AM -0700, Nadav Amit wrote:
> The mmu_notifier users would have to be aware that invalidations may be
> deferred. If they perform their "invalidations" unconditionally, it may be
> ok. If the notifier users avoid invalidations based on the PTE in the
> secondary page-table, it can be a problem.

invalidate_page was always deferred until after the PT lock was
released. Calling ->invalidate_range after PT lock release is not a
new thing, so we're still back to square one: finding out whether the
invalidate_page callout after PT lock release has always been broken
here or not.

> On another note, you may want to consider combining the secondary page-table
> mechanisms with the existing TLB-flush mechanisms. Right now, it is
> partially done: tlb_flush_mmu_tlbonly(), for example, calls
> mmu_notifier_invalidate_range(). However, tlb_gather_mmu() does not call
> mmu_notifier_invalidate_range_start().

If you implement ->invalidate_range_start you don't care about the TLB
gather at all, and you must not implement ->invalidate_range.

> This can also prevent all kind of inconsistencies, and potential bugs. For
> instance, clear_refs_write() calls mmu_notifier_invalidate_range_start/end()
> but in between there is no call for mmu_notifier_invalidate_range().

It's done in mmu_notifier_invalidate_range_end(), which is again fully
equivalent, except that it runs after PT lock release.
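
[Editor's note: to make the dispatch order discussed above concrete, here
is a minimal userspace sketch. It is illustrative only, not the kernel
implementation: toy_notifier_ops, toy_invalidate_range_start/end() and
flush_secondary_tlb() are made-up names standing in for the mmu_notifier
machinery. It models the claim that the range invalidation is folded into
the "end" side of the start/end pair, which is why a caller such as
clear_refs_write() needs no explicit mmu_notifier_invalidate_range() in
between.]

/*
 * Illustrative userspace model only -- not kernel code.  The names
 * loosely mirror the mmu_notifier callbacks to show the dispatch order
 * described in the mail: the range invalidation runs inside the
 * "..._range_end" call, after the PT lock has been dropped.
 */
#include <stdio.h>

struct toy_notifier_ops {
	/* secondary-TLB style user: only wants range invalidations */
	void (*invalidate_range)(unsigned long start, unsigned long end);
	/* start/end style user: tears everything down in "start" */
	void (*invalidate_range_start)(unsigned long start, unsigned long end);
	void (*invalidate_range_end)(unsigned long start, unsigned long end);
};

static void toy_invalidate_range_start(const struct toy_notifier_ops *ops,
				       unsigned long start, unsigned long end)
{
	if (ops->invalidate_range_start)
		ops->invalidate_range_start(start, end);
}

/*
 * Models the behaviour described above: ->invalidate_range is invoked
 * here, inside the "end" call, so no separate explicit call is needed
 * between start and end.
 */
static void toy_invalidate_range_end(const struct toy_notifier_ops *ops,
				     unsigned long start, unsigned long end)
{
	if (ops->invalidate_range)
		ops->invalidate_range(start, end);
	if (ops->invalidate_range_end)
		ops->invalidate_range_end(start, end);
}

static void flush_secondary_tlb(unsigned long start, unsigned long end)
{
	printf("secondary TLB flushed for [%#lx, %#lx)\n", start, end);
}

int main(void)
{
	/*
	 * A user that only implements ->invalidate_range still gets its
	 * flush, because the end side invokes it.  A start/end user would
	 * instead register the start/end callbacks and leave
	 * ->invalidate_range unset, matching the "must not implement
	 * ->invalidate_range" rule above.
	 */
	struct toy_notifier_ops ops = {
		.invalidate_range = flush_secondary_tlb,
	};

	toy_invalidate_range_start(&ops, 0x1000, 0x2000);
	/* ... page-table updates happen here, under the PT lock ... */
	toy_invalidate_range_end(&ops, 0x1000, 0x2000);
	return 0;
}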