Date: Thu, 15 Jun 2017 11:46:56 +0300
From: "Kirill A. Shutemov"
To: Andrea Arcangeli
Cc: Martin Schwidefsky, "Kirill A. Shutemov", Andrew Morton,
 Vlastimil Babka, Vineet Gupta, Russell King, Will Deacon,
 Catalin Marinas, Ralf Baechle, "David S. Miller", Heiko Carstens,
 "Aneesh Kumar K . V", linux-arch@vger.kernel.org, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
Subject: Re: [PATCH 3/3] mm, thp: Do not loose dirty bit in __split_huge_pmd_locked()
Message-ID: <20170615084656.bqevrlwtyyyxdbmd@node.shutemov.name>
References: <20170614135143.25068-1-kirill.shutemov@linux.intel.com>
 <20170614135143.25068-4-kirill.shutemov@linux.intel.com>
 <20170614161857.69d54338@mschwideX1>
 <20170614153131.GC5847@redhat.com>
In-Reply-To: <20170614153131.GC5847@redhat.com>
User-Agent: NeoMutt/20170609 (1.8.3)
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Jun 14, 2017 at 05:31:31PM +0200, Andrea Arcangeli wrote:
> Hello,
>
> On Wed, Jun 14, 2017 at 04:18:57PM +0200, Martin Schwidefsky wrote:
> > Could we change pmdp_invalidate to make it return the old pmd entry?
> > That to me seems the simplest fix to avoid losing the dirty bit.
>
> I earlier suggested to replace pmdp_invalidate with something like
> old_pmd = pmdp_establish(pmd_mknotpresent(pmd)) (then tlb flush could
> then be conditional to the old pmd being present). Making
> pmdp_invalidate return the old pmd entry would be mostly equivalent to
> that.
>
> The advantage of not changing pmdp_invalidate is that we could skip a
> xchg which is more costly in __split_huge_pmd_locked and
> madvise_free_huge_pmd so perhaps there's a point to keep a variant of
> pmdp_invalidate that doesn't use xchg internally (and in turn can't
> return the old pmd value atomically).
>
> If we don't want new messy names like pmdp_establish we could have a
> __pmdp_invalidate that returns void, and pmdp_invalidate that returns
> the old pmd and uses xchg (and it'd also be backwards compatible as
> far as the callers are concerned). So those places that don't need the
> old value returned and can skip the xchg, could simply
> s/pmdp_invalidate/__pmdp_invalidate/ to optimize.

We have only a few pmdp_invalidate() callers:

 - clear_soft_dirty_pmd();
 - madvise_free_huge_pmd();
 - change_huge_pmd();
 - __split_huge_pmd_locked();

Only madvise_free_huge_pmd() doesn't care about the old pmd.
__split_huge_pmd_locked() actually needs to check the dirty bit after
pmdp_invalidate(), see patch 3/3 of the patchset.

I don't think it's worth introducing one more primitive only for
madvise_free_huge_pmd().

I'll stick with a single pmdp_invalidate() that returns the old value.

-- 
 Kirill A. Shutemov
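
For readers following the thread, here is a rough userspace sketch of why
returning the old entry from an atomic exchange keeps the dirty bit. It is
a toy model, not kernel code: the toy_* names, the 64-bit entry type and
the bit positions are all made up for illustration, and C11 atomic_exchange
stands in for the kernel's xchg. The "hardware" setting the dirty bit is
simulated with an atomic OR between the snapshot and the exchange.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy pmd entry: a 64-bit word with two flag bits. The bit positions
 * are illustrative only and match no real architecture. */
typedef uint64_t toy_pmd_t;

#define TOY_PMD_PRESENT (UINT64_C(1) << 0)
#define TOY_PMD_DIRTY   (UINT64_C(1) << 6)

static inline bool toy_pmd_present(toy_pmd_t pmd)
{
    return (pmd & TOY_PMD_PRESENT) != 0;
}

static inline bool toy_pmd_dirty(toy_pmd_t pmd)
{
    return (pmd & TOY_PMD_DIRTY) != 0;
}

static inline toy_pmd_t toy_pmd_mknotpresent(toy_pmd_t pmd)
{
    return pmd & ~TOY_PMD_PRESENT;
}

/* Hypothetical stand-in for the pmdp_establish() idea from the thread:
 * install new_pmd and return whatever was in the slot at that instant. */
static inline toy_pmd_t toy_pmdp_establish(_Atomic toy_pmd_t *pmdp,
                                           toy_pmd_t new_pmd)
{
    return atomic_exchange(pmdp, new_pmd);
}

/* Walk through the race: the dirty bit is set after the caller took its
 * snapshot but before the entry is invalidated. Returns true if the
 * value handed back by the exchange still carries the dirty bit. */
static inline bool toy_demo_dirty_survives(void)
{
    _Atomic toy_pmd_t entry = TOY_PMD_PRESENT;

    /* Caller reads the entry; it is clean at this point. */
    toy_pmd_t snapshot = atomic_load(&entry);

    /* Simulated hardware write: the entry becomes dirty behind our back. */
    atomic_fetch_or(&entry, TOY_PMD_DIRTY);

    /* Invalidate using the stale snapshot, but keep the returned value. */
    toy_pmd_t old = toy_pmdp_establish(&entry,
                                       toy_pmd_mknotpresent(snapshot));

    return !toy_pmd_dirty(snapshot) && toy_pmd_dirty(old);
}
```

In this model a void-returning invalidate would leave the caller with only
the stale snapshot, which does not show the dirty bit; the value returned
by the exchange does. Whether the extra cost of the exchange matters for a
given caller is the trade-off discussed above.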