From: "Kirill A. Shutemov"
To: Andrew Morton, Vlastimil Babka, Vineet Gupta, Russell King,
    Will Deacon, Catalin Marinas, Ralf Baechle, "David S. Miller",
    "Aneesh Kumar K . V", Martin Schwidefsky, Heiko Carstens,
    Andrea Arcangeli
Cc: linux-arch@vger.kernel.org, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org, "Kirill A. Shutemov"
Subject: [PATCHv2 2/3] mm: Do not loose dirty and access bits in pmdp_invalidate()
Date: Thu, 15 Jun 2017 17:52:23 +0300
Message-Id: <20170615145224.66200-3-kirill.shutemov@linux.intel.com>
In-Reply-To: <20170615145224.66200-1-kirill.shutemov@linux.intel.com>
References: <20170615145224.66200-1-kirill.shutemov@linux.intel.com>

Vlastimil noted that pmdp_invalidate() is not atomic and we can lose
dirty and access bits if the CPU sets them after the pmdp dereference,
but before set_pmd_at().

The bug doesn't lead to user-visible misbehaviour in the current kernel.

Losing the access bit can lead to sub-optimal reclaim behaviour for THP,
but nothing destructive.

Losing the dirty bit is not a big deal either: we would make the page
dirty unconditionally on splitting a huge page.

The fix is critical for future work on THP: both huge-ext4 and THP swap
out rely on proper dirty tracking.

The patch changes pmdp_invalidate() to make the entry non-present
atomically and return the previous value of the entry. This value can be
used to check if the CPU set dirty/accessed bits under us.

Signed-off-by: Kirill A. Shutemov
Reported-by: Vlastimil Babka
---
 include/asm-generic/pgtable.h | 2 +-
 mm/pgtable-generic.c          | 9 +++++----
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index 7dfa767dc680..ece5e399567a 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -309,7 +309,7 @@ extern pgtable_t pgtable_trans_huge_withdraw(struct mm_struct *mm, pmd_t *pmdp);
 #endif
 
 #ifndef __HAVE_ARCH_PMDP_INVALIDATE
-extern void pmdp_invalidate(struct vm_area_struct *vma, unsigned long address,
+extern pmd_t pmdp_invalidate(struct vm_area_struct *vma, unsigned long address,
 			    pmd_t *pmdp);
 #endif
 
diff --git a/mm/pgtable-generic.c b/mm/pgtable-generic.c
index c99d9512a45b..148fe36f61a7 100644
--- a/mm/pgtable-generic.c
+++ b/mm/pgtable-generic.c
@@ -179,12 +179,13 @@ pgtable_t pgtable_trans_huge_withdraw(struct mm_struct *mm, pmd_t *pmdp)
 #endif
 
 #ifndef __HAVE_ARCH_PMDP_INVALIDATE
-void pmdp_invalidate(struct vm_area_struct *vma, unsigned long address,
+pmd_t pmdp_invalidate(struct vm_area_struct *vma, unsigned long address,
 		     pmd_t *pmdp)
 {
-	pmd_t entry = *pmdp;
-	set_pmd_at(vma->vm_mm, address, pmdp, pmd_mknotpresent(entry));
-	flush_pmd_tlb_range(vma, address, address + HPAGE_PMD_SIZE);
+	pmd_t old = pmdp_establish(pmdp, pmd_mknotpresent(*pmdp));
+	if (pmd_present(old))
+		flush_pmd_tlb_range(vma, address, address + HPAGE_PMD_SIZE);
+	return old;
 }
 #endif
 
--
2.11.0
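
For readers who want to see the race outside the kernel, here is a minimal
userspace sketch using C11 atomics. It is not kernel code: entry_t, the
ENTRY_* bit positions, hw_sets_bits() and both invalidate_*() helpers are
hypothetical names made up for illustration, and atomic_exchange() only
stands in for whatever an architecture's pmdp_establish() does. The
"window" callback models the CPU setting bits between the read of the
entry and the write of the non-present value.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit layout, for illustration only. */
#define ENTRY_PRESENT  (UINT64_C(1) << 0)
#define ENTRY_ACCESSED (UINT64_C(1) << 5)
#define ENTRY_DIRTY    (UINT64_C(1) << 6)

typedef _Atomic uint64_t entry_t;

/* Stand-in for hardware setting accessed/dirty bits behind our back. */
void hw_sets_bits(entry_t *e)
{
	atomic_fetch_or(e, ENTRY_ACCESSED | ENTRY_DIRTY);
}

/*
 * Old scheme: read the entry, then write the non-present value back.
 * Anything the "hardware" sets inside the window is overwritten and
 * the caller has no way to learn about it.
 */
void invalidate_racy(entry_t *e, void (*window)(entry_t *))
{
	uint64_t entry = atomic_load(e);

	if (window != NULL)
		window(e);
	atomic_store(e, entry & ~ENTRY_PRESENT);
}

/*
 * New scheme: swap in the non-present value and return what was there.
 * Bits set inside the window show up in the returned old value.
 */
uint64_t invalidate_atomic(entry_t *e, void (*window)(entry_t *))
{
	uint64_t newval = atomic_load(e) & ~ENTRY_PRESENT;

	if (window != NULL)
		window(e);
	return atomic_exchange(e, newval);
}
```

Starting from an entry that holds only ENTRY_PRESENT and passing
hw_sets_bits as the window, invalidate_racy() leaves no trace of the dirty
bit, while invalidate_atomic() returns an old value that still carries it,
so the caller can transfer it to the page.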