From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Peter Xu,
	Alex Williamson, Ajay Kaher, Sasha Levin
Subject: [PATCH 5.4 074/129] vfio-pci: Fault mmaps to enable vma tracking
Date: Tue, 8 Sep 2020 17:25:15 +0200
Message-Id: <20200908152233.397144995@linuxfoundation.org>
In-Reply-To: <20200908152229.689878733@linuxfoundation.org>
References: <20200908152229.689878733@linuxfoundation.org>
X-Mailing-List: linux-kernel@vger.kernel.org

From: Ajay Kaher

commit 11c4cd07ba111a09f49625f9e4c851d83daf0a22 upstream.

Rather than calling remap_pfn_range() when a region is mmap'd, setup
a vm_ops handler to support dynamic faulting of the range on access.
This allows us to manage a list of vmas actively mapping the area that
we can later use to invalidate those mappings.  The open callback
invalidates the vma range so that all tracking is inserted in the
fault handler and removed in the close handler.

Reviewed-by: Peter Xu
Signed-off-by: Alex Williamson
Signed-off-by: Ajay Kaher
Signed-off-by: Sasha Levin
---
 drivers/vfio/pci/vfio_pci.c         | 76 ++++++++++++++++++++++++++++-
 drivers/vfio/pci/vfio_pci_private.h |  7 +++
 2 files changed, 81 insertions(+), 2 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index 02206162eaa9e..da1d1eac0def1 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -1192,6 +1192,70 @@ static ssize_t vfio_pci_write(void *device_data, const char __user *buf,
 	return vfio_pci_rw(device_data, (char __user *)buf, count, ppos, true);
 }
 
+static int vfio_pci_add_vma(struct vfio_pci_device *vdev,
+			    struct vm_area_struct *vma)
+{
+	struct vfio_pci_mmap_vma *mmap_vma;
+
+	mmap_vma = kmalloc(sizeof(*mmap_vma), GFP_KERNEL);
+	if (!mmap_vma)
+		return -ENOMEM;
+
+	mmap_vma->vma = vma;
+
+	mutex_lock(&vdev->vma_lock);
+	list_add(&mmap_vma->vma_next, &vdev->vma_list);
+	mutex_unlock(&vdev->vma_lock);
+
+	return 0;
+}
+
+/*
+ * Zap mmaps on open so that we can fault them in on access and therefore
+ * our vma_list only tracks mappings accessed since last zap.
+ */
+static void vfio_pci_mmap_open(struct vm_area_struct *vma)
+{
+	zap_vma_ptes(vma, vma->vm_start, vma->vm_end - vma->vm_start);
+}
+
+static void vfio_pci_mmap_close(struct vm_area_struct *vma)
+{
+	struct vfio_pci_device *vdev = vma->vm_private_data;
+	struct vfio_pci_mmap_vma *mmap_vma;
+
+	mutex_lock(&vdev->vma_lock);
+	list_for_each_entry(mmap_vma, &vdev->vma_list, vma_next) {
+		if (mmap_vma->vma == vma) {
+			list_del(&mmap_vma->vma_next);
+			kfree(mmap_vma);
+			break;
+		}
+	}
+	mutex_unlock(&vdev->vma_lock);
+}
+
+static vm_fault_t vfio_pci_mmap_fault(struct vm_fault *vmf)
+{
+	struct vm_area_struct *vma = vmf->vma;
+	struct vfio_pci_device *vdev = vma->vm_private_data;
+
+	if (vfio_pci_add_vma(vdev, vma))
+		return VM_FAULT_OOM;
+
+	if (remap_pfn_range(vma, vma->vm_start, vma->vm_pgoff,
+			    vma->vm_end - vma->vm_start, vma->vm_page_prot))
+		return VM_FAULT_SIGBUS;
+
+	return VM_FAULT_NOPAGE;
+}
+
+static const struct vm_operations_struct vfio_pci_mmap_ops = {
+	.open = vfio_pci_mmap_open,
+	.close = vfio_pci_mmap_close,
+	.fault = vfio_pci_mmap_fault,
+};
+
 static int vfio_pci_mmap(void *device_data, struct vm_area_struct *vma)
 {
 	struct vfio_pci_device *vdev = device_data;
@@ -1250,8 +1314,14 @@ static int vfio_pci_mmap(void *device_data, struct vm_area_struct *vma)
 	vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
 	vma->vm_pgoff = (pci_resource_start(pdev, index) >> PAGE_SHIFT) + pgoff;
 
-	return remap_pfn_range(vma, vma->vm_start, vma->vm_pgoff,
-			       req_len, vma->vm_page_prot);
+	/*
+	 * See remap_pfn_range(), called from vfio_pci_fault() but we can't
+	 * change vm_flags within the fault handler.  Set them now.
+	 */
+	vma->vm_flags |= VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP;
+	vma->vm_ops = &vfio_pci_mmap_ops;
+
+	return 0;
 }
 
 static void vfio_pci_request(void *device_data, unsigned int count)
@@ -1327,6 +1397,8 @@ static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	spin_lock_init(&vdev->irqlock);
 	mutex_init(&vdev->ioeventfds_lock);
 	INIT_LIST_HEAD(&vdev->ioeventfds_list);
+	mutex_init(&vdev->vma_lock);
+	INIT_LIST_HEAD(&vdev->vma_list);
 
 	ret = vfio_add_group_dev(&pdev->dev, &vfio_pci_ops, vdev);
 	if (ret) {
diff --git a/drivers/vfio/pci/vfio_pci_private.h b/drivers/vfio/pci/vfio_pci_private.h
index ee6ee91718a4d..898844894ed85 100644
--- a/drivers/vfio/pci/vfio_pci_private.h
+++ b/drivers/vfio/pci/vfio_pci_private.h
@@ -84,6 +84,11 @@ struct vfio_pci_reflck {
 	struct mutex		lock;
 };
 
+struct vfio_pci_mmap_vma {
+	struct vm_area_struct	*vma;
+	struct list_head	vma_next;
+};
+
 struct vfio_pci_device {
 	struct pci_dev		*pdev;
 	void __iomem		*barmap[PCI_STD_RESOURCE_END + 1];
@@ -122,6 +127,8 @@ struct vfio_pci_device {
 	struct list_head	dummy_resources_list;
 	struct mutex		ioeventfds_lock;
 	struct list_head	ioeventfds_list;
+	struct mutex		vma_lock;
+	struct list_head	vma_list;
 };
 
 #define is_intx(vdev) (vdev->irq_type == VFIO_PCI_INTX_IRQ_INDEX)
-- 
2.25.1