Subject: Re: [PATCH v11 09/22] vfio iommu type1: Add task structure to vfio_dma
To: Alex Williamson
From: Kirti Wankhede
Date: Tue, 8 Nov 2016 19:43:25 +0530
Message-ID: <71e24995-1678-7e43-90fa-7798cfcdebbc@nvidia.com>
In-Reply-To: <20161107140348.55176252@t450s.home>
References: <1478293856-8191-1-git-send-email-kwankhede@nvidia.com>
 <1478293856-8191-10-git-send-email-kwankhede@nvidia.com>
 <20161107140348.55176252@t450s.home>
X-Mailing-List: linux-kernel@vger.kernel.org

On 11/8/2016 2:33 AM, Alex Williamson wrote:
> On Sat, 5 Nov 2016 02:40:43 +0530
> Kirti Wankhede wrote:
>
...
>>  static int vfio_dma_do_map(struct vfio_iommu *iommu,
>>  			   struct vfio_iommu_type1_dma_map *map)
>>  {
>>  	dma_addr_t iova = map->iova;
>>  	unsigned long vaddr = map->vaddr;
>>  	size_t size = map->size;
>> -	long npage;
>>  	int ret = 0, prot = 0;
>>  	uint64_t mask;
>>  	struct vfio_dma *dma;
>> -	unsigned long pfn;
>> +	struct vfio_addr_space *addr_space;
>> +	struct mm_struct *mm;
>> +	bool free_addr_space_on_err = false;
>>
>>  	/* Verify that none of our __u64 fields overflow */
>>  	if (map->size != size || map->vaddr != vaddr || map->iova != iova)
>> @@ -608,47 +685,56 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu,
>>  	mutex_lock(&iommu->lock);
>>
>>  	if (vfio_find_dma(iommu, iova, size)) {
>> -		mutex_unlock(&iommu->lock);
>> -		return -EEXIST;
>> +		ret = -EEXIST;
>> +		goto do_map_err;
>> +	}
>> +
>> +	mm = get_task_mm(current);
>> +	if (!mm) {
>> +		ret = -ENODEV;
>
> -EFAULT?
>

The -ENODEV return is in the original code from vfio_pin_pages():

	if (!current->mm)
		return -ENODEV;

At one point I thought of changing it to -EFAULT, but then changed it
back to -ENODEV to be consistent with the original error code. Should I
still change this return to -EFAULT?

>> +		goto do_map_err;
>> +	}
>> +
>> +	addr_space = vfio_find_addr_space(iommu, mm);
>> +	if (addr_space) {
>> +		atomic_inc(&addr_space->ref_count);
>> +		mmput(mm);
>> +	} else {
>> +		addr_space = kzalloc(sizeof(*addr_space), GFP_KERNEL);
>> +		if (!addr_space) {
>> +			ret = -ENOMEM;
>> +			goto do_map_err;
>> +		}
>> +		addr_space->mm = mm;
>> +		atomic_set(&addr_space->ref_count, 1);
>> +		list_add(&addr_space->next, &iommu->addr_space_list);
>> +		free_addr_space_on_err = true;
>>  	}
>>
>>  	dma = kzalloc(sizeof(*dma), GFP_KERNEL);
>>  	if (!dma) {
>> -		mutex_unlock(&iommu->lock);
>> -		return -ENOMEM;
>> +		if (free_addr_space_on_err) {
>> +			mmput(mm);
>> +			list_del(&addr_space->next);
>> +			kfree(addr_space);
>> +		}
>> +		ret = -ENOMEM;
>> +		goto do_map_err;
>>  	}
>>
>>  	dma->iova = iova;
>>  	dma->vaddr = vaddr;
>>  	dma->prot = prot;
>> +	dma->addr_space = addr_space;
>> +	get_task_struct(current);
>> +	dma->task = current;
>> +	dma->mlock_cap = capable(CAP_IPC_LOCK);
>
>
> How do you reason we can cache this?  Does the fact that the process
> had this capability at the time that it did a DMA_MAP imply that it
> necessarily still has this capability when an external user (vendor
> driver) tries to pin pages?  I don't see how we can make that
> assumption.
>
>

Will the process change its MEMLOCK limit at runtime? I think it
shouldn't; correct me if I'm wrong. QEMU doesn't do that, right?

The function capable() checks the current task's capability. But when
vfio_pin_pages() is called, the call could come from another task, while
the pages are pinned from the address space of the task that mapped
them. So we can't use capable() in vfio_pin_pages().

If this capability shouldn't be cached, we have to use has_capability()
with dma->task as the argument in vfio_pin_pages():

	bool has_capability(struct task_struct *t, int cap)

Thanks,
Kirti
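
To make the two options in this thread concrete, here is a minimal userspace sketch of how a capability snapshot taken at map time can diverge from a check made at pin time. It is only a model under stated assumptions: `fake_task`, `fake_dma`, `fake_has_capability()` and the other helpers are hypothetical stand-ins invented for illustration, not the kernel's `task_struct`, `vfio_dma`, or `has_capability()`.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a task; NOT the kernel's task_struct. */
struct fake_task {
	bool cap_ipc_lock;	/* models whether the task holds CAP_IPC_LOCK right now */
};

/* Hypothetical stand-in for a mapping record; only the fields discussed here. */
struct fake_dma {
	struct fake_task *task;	/* task that performed the map */
	bool mlock_cap;		/* snapshot of the capability taken at map time */
};

/* Models a per-task query in the spirit of has_capability(dma->task, ...). */
static bool fake_has_capability(const struct fake_task *t)
{
	return t->cap_ipc_lock;
}

/* Map time: remember the mapping task and cache its capability. */
static void fake_map(struct fake_dma *dma, struct fake_task *mapper)
{
	dma->task = mapper;
	dma->mlock_cap = fake_has_capability(mapper);
}

/* Pin time, option 1: trust the snapshot cached at map time. */
static bool pin_uses_cached(const struct fake_dma *dma)
{
	return dma->mlock_cap;
}

/* Pin time, option 2: re-check the mapping task when pinning. */
static bool pin_uses_live(const struct fake_dma *dma)
{
	return fake_has_capability(dma->task);
}

struct pin_result {
	bool cached;
	bool live;
};

/* Scenario: the mapper holds the capability at map time, then loses it. */
static struct pin_result demo_capability_drop(void)
{
	struct fake_task mapper = { .cap_ipc_lock = true };
	struct fake_dma dma;
	struct pin_result r;

	fake_map(&dma, &mapper);
	mapper.cap_ipc_lock = false;	/* capability dropped after mapping */

	r.cached = pin_uses_cached(&dma);
	r.live = pin_uses_live(&dma);
	return r;
}
```

In this model the cached answer stays true after the task drops the capability, while the pin-time check reports false. Whether a real process ever changes its capabilities between DMA_MAP and a later pin is exactly the open question above; the sketch only shows what each choice would do if it did.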