Date: Thu, 6 Apr 2017 16:22:02 +0200
From: Radim Krčmář
To: Filippo Sironi
Cc: sironi@amazon.com, Anthony Liguori, kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] x86, kvm: Handle PFNs outside of kernel reach when touching GPTEs
Message-ID: <20170406142201.GA2817@potion>
In-Reply-To: <1491397622-16665-1-git-send-email-sironi@amazon.de>

2017-04-05 15:07+0200, Filippo Sironi:
> cmpxchg_gpte() calls get_user_pages_fast() to retrieve the number of
> pages and the respective struct pages for mapping in the kernel virtual
> address space.
> This doesn't work if get_user_pages_fast() is invoked with a userspace
> virtual address that's backed by PFNs outside of kernel reach (e.g.,
> when limiting the kernel memory with mem= in the command line and using
> /dev/mem to map memory).
>
> If get_user_pages_fast() fails, look up the VMA that backs the userspace
> virtual address, compute the PFN and the physical address, and map it in
> the kernel virtual address space with memremap().

What is the reason for a configuration that voluntarily restricts access
to memory that it needs?

> Signed-off-by: Filippo Sironi
> Cc: Anthony Liguori
> Cc: kvm@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> ---
> diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
> @@ -147,15 +147,36 @@ static int FNAME(cmpxchg_gpte)(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
>  	struct page *page;
>
>  	npages = get_user_pages_fast((unsigned long)ptep_user, 1, 1, &page);
> -	/* Check if the user is doing something meaningless. */
> -	if (unlikely(npages != 1))
> -		return -EFAULT;
> -
> -	table = kmap_atomic(page);
> -	ret = CMPXCHG(&table[index], orig_pte, new_pte);
> -	kunmap_atomic(table);
> -
> -	kvm_release_page_dirty(page);
> +	if (likely(npages == 1)) {
> +		table = kmap_atomic(page);
> +		ret = CMPXCHG(&table[index], orig_pte, new_pte);
> +		kunmap_atomic(table);
> +
> +		kvm_release_page_dirty(page);
> +	} else {
> +		struct vm_area_struct *vma;
> +		unsigned long vaddr = (unsigned long)ptep_user & PAGE_MASK;
> +		unsigned long pfn;
> +		unsigned long paddr;
> +
> +		down_read(&current->mm->mmap_sem);
> +		vma = find_vma_intersection(current->mm, vaddr,
> +					    vaddr + PAGE_SIZE);

Hm, with the argument order like this, we check that

  vaddr < vma->vm_end && vaddr + PAGE_SIZE > vma->vm_start

but shouldn't we actually check that the whole page is there, i.e.

  vaddr + PAGE_SIZE < vma->vm_end && vaddr > vma->vm_start

?

Thanks.

> +		if (!vma || !(vma->vm_flags & VM_PFNMAP)) {
> +			up_read(&current->mm->mmap_sem);
> +			return -EFAULT;
> +		}
> +		pfn = ((vaddr - vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff;
> +		paddr = pfn << PAGE_SHIFT;
> +		table = memremap(paddr, PAGE_SIZE, MEMREMAP_WB);

(I don't understand why there isn't a wrapper for this ...
 Looks like we're doing something unexpected.)

> +		if (!table) {
> +			up_read(&current->mm->mmap_sem);
> +			return -EFAULT;
> +		}
> +		ret = CMPXCHG(&table[index], orig_pte, new_pte);
> +		memunmap(table);
> +		up_read(&current->mm->mmap_sem);
> +	}
>
>  	return (ret != orig_pte);
>  }
> --
> 2.7.4
>
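
For reference, a minimal userspace sketch of the two range checks raised in the find_vma_intersection() comment above. The struct and function names (demo_vma, demo_range_intersects, demo_range_contained) are hypothetical and only model vm_start/vm_end; this is not kernel code. The containment variant uses inclusive bounds (>= and <=), which is an assumption about what "the whole page is there" should mean, since strict comparisons would reject a page that sits exactly at a VMA boundary.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two vm_area_struct fields used below. */
struct demo_vma {
	unsigned long vm_start;	/* first address inside the mapping */
	unsigned long vm_end;	/* first address past the mapping */
};

/*
 * Overlap test, as quoted in the review comment: true when the range
 * [start, end) shares at least one byte with the VMA.
 */
static bool demo_range_intersects(const struct demo_vma *vma,
				  unsigned long start, unsigned long end)
{
	return start < vma->vm_end && end > vma->vm_start;
}

/*
 * Containment test with inclusive bounds (an assumption, see above):
 * true only when every byte of [start, end) lies inside the VMA.
 */
static bool demo_range_contained(const struct demo_vma *vma,
				 unsigned long start, unsigned long end)
{
	return start >= vma->vm_start && end <= vma->vm_end;
}
```

For a VMA covering [0x1000, 0x3000), the range [0x2800, 0x3800) intersects but is not contained, while [0x2000, 0x3000) passes both tests. If vaddr and the VMA bounds are all page-aligned, the two tests should give the same answer for a single page, so the distinction would only matter for unaligned or multi-page ranges.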