Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1760127Ab3EANMI (ORCPT );
	Wed, 1 May 2013 09:12:08 -0400
Received: from relay2.sgi.com ([192.48.179.30]:54783 "EHLO relay.sgi.com"
	rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP
	id S1755369Ab3EANMD (ORCPT );
	Wed, 1 May 2013 09:12:03 -0400
To: linux-kernel@vger.kernel.org
Subject: [PATCH] mm/pagewalk.c: walk_page_range should avoid VM_PFNMAP areas
Cc: akpm@linux-foundation.org, mgorman@suse.de, aarcange@redhat.com,
	dave.hansen@intel.com, dsterba@suse.cz, hannes@cmpxchg.org,
	kosaki.motohiro@gmail.com, kirill.shutemov@linux.intel.com,
	mpm@selenic.com, n-horiguchi@ah.jp.nec.com, rdunlap@infradead.org
Message-Id:
From: Cliff Wickman
Date: Wed, 01 May 2013 08:12:01 -0500
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 4167
Lines: 125

This patch replaces "[PATCH] fs/proc: smaps should avoid VM_PFNMAP areas".

/proc/<pid>/smaps and similar walks through a user page table should not
be looking at VM_PFNMAP areas.

Certain tests in walk_page_range() (specifically split_huge_page_pmd())
assume that all the mapped PFNs are backed with page structures.  This is
not usually true for VM_PFNMAP areas, and the mismatch can result in
panics on kernel page faults when those nonexistent page structures are
addressed.

There are a half dozen callers of walk_page_range() that walk through a
task's entire page table (as N. Horiguchi pointed out).  So rather than
change all of them, this patch changes just walk_page_range() to ignore
VM_PFNMAP areas.

The logic of hugetlb_vma() is moved back into walk_page_range(), as we
want to test any vma in the range.
VM_PFNMAP areas are used by:
- graphics memory manager	gpu/drm/drm_gem.c
- global reference unit		sgi-gru/grufile.c
- sgi special memory		char/mspec.c
- and probably several out-of-tree modules

I'm copying everyone who has changed this file recently, in case there is
some reason that I am not aware of to provide
/proc/<pid>/smaps|clear_refs|maps|numa_maps for these VM_PFNMAP areas.

Signed-off-by: Cliff Wickman
---
 mm/pagewalk.c |   60 +++++++++++++++++++++++++++++-----------------------------
 1 file changed, 31 insertions(+), 29 deletions(-)

Index: linux/mm/pagewalk.c
===================================================================
--- linux.orig/mm/pagewalk.c
+++ linux/mm/pagewalk.c
@@ -127,22 +127,6 @@ static int walk_hugetlb_range(struct vm_
 	return 0;
 }
 
-static struct vm_area_struct* hugetlb_vma(unsigned long addr, struct mm_walk *walk)
-{
-	struct vm_area_struct *vma;
-
-	/* We don't need vma lookup at all. */
-	if (!walk->hugetlb_entry)
-		return NULL;
-
-	VM_BUG_ON(!rwsem_is_locked(&walk->mm->mmap_sem));
-	vma = find_vma(walk->mm, addr);
-	if (vma && vma->vm_start <= addr && is_vm_hugetlb_page(vma))
-		return vma;
-
-	return NULL;
-}
-
 #else /* CONFIG_HUGETLB_PAGE */
 static struct vm_area_struct* hugetlb_vma(unsigned long addr, struct mm_walk *walk)
 {
@@ -200,28 +184,46 @@ int walk_page_range(unsigned long addr,
 	pgd = pgd_offset(walk->mm, addr);
 	do {
-		struct vm_area_struct *vma;
+		struct vm_area_struct *vma = NULL;
 
 		next = pgd_addr_end(addr, end);
 
 		/*
-		 * handle hugetlb vma individually because pagetable walk for
-		 * the hugetlb page is dependent on the architecture and
-		 * we can't handled it in the same manner as non-huge pages.
+		 * Check any special vma's within this range.
 		 */
-		vma = hugetlb_vma(addr, walk);
+		VM_BUG_ON(!rwsem_is_locked(&walk->mm->mmap_sem));
+		vma = find_vma(walk->mm, addr);
 		if (vma) {
-			if (vma->vm_end < next)
+			/*
+			 * There are no page structures backing a VM_PFNMAP
+			 * range, so allow no split_huge_page_pmd().
+			 */
+			if (vma->vm_flags & VM_PFNMAP) {
 				next = vma->vm_end;
+				pgd = pgd_offset(walk->mm, next);
+				continue;
+			}
 			/*
-			 * Hugepage is very tightly coupled with vma, so
-			 * walk through hugetlb entries within a given vma.
+			 * Handle hugetlb vma individually because pagetable
+			 * walk for the hugetlb page is dependent on the
+			 * architecture and we can't handled it in the same
+			 * manner as non-huge pages.
 			 */
-			err = walk_hugetlb_range(vma, addr, next, walk);
-			if (err)
-				break;
-			pgd = pgd_offset(walk->mm, next);
-			continue;
+			if (walk->hugetlb_entry && (vma->vm_start <= addr) &&
+			    is_vm_hugetlb_page(vma)) {
+				if (vma->vm_end < next)
+					next = vma->vm_end;
+				/*
+				 * Hugepage is very tightly coupled with vma,
+				 * so walk through hugetlb entries within a
+				 * given vma.
+				 */
+				err = walk_hugetlb_range(vma, addr, next, walk);
+				if (err)
+					break;
+				pgd = pgd_offset(walk->mm, next);
+				continue;
+			}
 		}
 
 		if (pgd_none_or_clear_bad(pgd)) {