Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751724AbdFIOVZ (ORCPT ); Fri, 9 Jun 2017 10:21:25 -0400 Received: from mx0a-001b2d01.pphosted.com ([148.163.156.1]:43866 "EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751679AbdFIOVW (ORCPT ); Fri, 9 Jun 2017 10:21:22 -0400 From: Laurent Dufour To: paulmck@linux.vnet.ibm.com, peterz@infradead.org, akpm@linux-foundation.org, kirill@shutemov.name, ak@linux.intel.com, mhocko@kernel.org, dave@stgolabs.net, jack@suse.cz, Matthew Wilcox Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, haren@linux.vnet.ibm.com, khandual@linux.vnet.ibm.com, npiggin@gmail.com, bsingharora@gmail.com Subject: [RFC v4 01/20] mm: Dont assume page-table invariance during faults Date: Fri, 9 Jun 2017 16:20:50 +0200 X-Mailer: git-send-email 2.7.4 In-Reply-To: <1497018069-17790-1-git-send-email-ldufour@linux.vnet.ibm.com> References: <1497018069-17790-1-git-send-email-ldufour@linux.vnet.ibm.com> X-TM-AS-GCONF: 00 x-cbid: 17060914-0040-0000-0000-000003C652AA X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 17060914-0041-0000-0000-0000205E6D0C Message-Id: <1497018069-17790-2-git-send-email-ldufour@linux.vnet.ibm.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:,, definitions=2017-06-09_07:,, signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 spamscore=0 suspectscore=2 malwarescore=0 phishscore=0 adultscore=0 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1703280000 definitions=main-1706090251 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 1994 Lines: 60 From: Peter Zijlstra One of the side effects of speculating on faults (without holding mmap_sem) is that we can race with free_pgtables() and therefore we cannot assume the page-tables will stick around. Remove the relyance on the pte pointer. Signed-off-by: Peter Zijlstra (Intel) --- mm/memory.c | 27 --------------------------- 1 file changed, 27 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index 2e65df1831d9..fd952f05e016 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -2102,30 +2102,6 @@ int apply_to_page_range(struct mm_struct *mm, unsigned long addr, } EXPORT_SYMBOL_GPL(apply_to_page_range); -/* - * handle_pte_fault chooses page fault handler according to an entry which was - * read non-atomically. Before making any commitment, on those architectures - * or configurations (e.g. i386 with PAE) which might give a mix of unmatched - * parts, do_swap_page must check under lock before unmapping the pte and - * proceeding (but do_wp_page is only called after already making such a check; - * and do_anonymous_page can safely check later on). - */ -static inline int pte_unmap_same(struct mm_struct *mm, pmd_t *pmd, - pte_t *page_table, pte_t orig_pte) -{ - int same = 1; -#if defined(CONFIG_SMP) || defined(CONFIG_PREEMPT) - if (sizeof(pte_t) > sizeof(unsigned long)) { - spinlock_t *ptl = pte_lockptr(mm, pmd); - spin_lock(ptl); - same = pte_same(*page_table, orig_pte); - spin_unlock(ptl); - } -#endif - pte_unmap(page_table); - return same; -} - static inline void cow_user_page(struct page *dst, struct page *src, unsigned long va, struct vm_area_struct *vma) { debug_dma_assert_idle(src); @@ -2682,9 +2658,6 @@ int do_swap_page(struct vm_fault *vmf) int exclusive = 0; int ret = 0; - if (!pte_unmap_same(vma->vm_mm, vmf->pmd, vmf->pte, vmf->orig_pte)) - goto out; - entry = pte_to_swp_entry(vmf->orig_pte); if (unlikely(non_swap_entry(entry))) { if (is_migration_entry(entry)) { -- 2.7.4