Subject: Re: [PATCH v13 11/22] vfio iommu: Add blocking notifier to notify DMA_UNMAP
From: Kirti Wankhede
To: Alex Williamson
Date: Wed, 16 Nov 2016 09:13:37 +0530
Message-ID: <473d10c5-b2cb-e976-a923-b5add22bcde6@nvidia.com>
In-Reply-To: <20161115202522.16d1990e@t450s.home>
List-ID: X-Mailing-List: linux-kernel@vger.kernel.org

On 11/16/2016 8:55 AM, Alex Williamson wrote:
> On Tue, 15 Nov 2016 20:16:12 -0700
> Alex Williamson wrote:
>
>> On Wed, 16 Nov 2016 08:16:15 +0530
>> Kirti Wankhede wrote:
>>
>>> On 11/16/2016 3:49 AM, Alex Williamson wrote:
>>>> On Tue, 15 Nov 2016 20:59:54 +0530
>>>> Kirti Wankhede wrote:
>>>>
>>> ...
>>>
>>>>> @@ -854,7 +857,28 @@ static int vfio_dma_do_unmap(struct vfio_iommu *iommu,
>>>>>  		 */
>>>>>  		if (dma->task->mm != current->mm)
>>>>>  			break;
>>>>> +
>>>>>  		unmapped += dma->size;
>>>>> +
>>>>> +		if (iommu->external_domain && !RB_EMPTY_ROOT(&dma->pfn_list)) {
>>>>> +			struct vfio_iommu_type1_dma_unmap nb_unmap;
>>>>> +
>>>>> +			nb_unmap.iova = dma->iova;
>>>>> +			nb_unmap.size = dma->size;
>>>>> +
>>>>> +			/*
>>>>> +			 * Notifier callback would call vfio_unpin_pages() which
>>>>> +			 * would acquire iommu->lock. Release lock here and
>>>>> +			 * reacquire it again.
>>>>> +			 */
>>>>> +			mutex_unlock(&iommu->lock);
>>>>> +			blocking_notifier_call_chain(&iommu->notifier,
>>>>> +						    VFIO_IOMMU_NOTIFY_DMA_UNMAP,
>>>>> +						    &nb_unmap);
>>>>> +			mutex_lock(&iommu->lock);
>>>>> +			if (WARN_ON(!RB_EMPTY_ROOT(&dma->pfn_list)))
>>>>> +				break;
>>>>> +		}
>>>>
>>>>
>>>> Why exactly do we need to notify per vfio_dma rather than per unmap
>>>> request?  If we do the latter we can send the notify first, limiting us
>>>> to races where a page is pinned between the notify and the locking,
>>>> whereas here, even our dma pointer is suspect once we re-acquire the
>>>> lock, we don't technically know if another unmap could have removed
>>>> that already.  Perhaps something like this (untested):
>>>>
>>>
>>> There are checks to validate the unmap request, like the v2 check and
>>> checks on who is calling unmap and whether that task is allowed to
>>> unmap. Until those checks pass, we can't be sure that the whole range
>>> asked for in the unmap request will actually be unmapped. The notify
>>> call should be placed where we know for certain that the range passed
>>> to it is definitely going to be removed. My change does that.
>>
>> Ok, but that does solve the problem.  What about this (untested):
>
> s/does/does not/
>
> BTW, I like how the retries here fill the gap in my previous proposal
> where we could still race re-pinning.  We've given it an honest shot or
> someone is not participating if we've retried 10 times.
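As a minimal userspace sketch of the locking pattern being debated above: the notifier callback re-takes the same lock the unmap path holds, so the unmap path must drop the lock before calling out and re-check state after re-acquiring it. All names here are hypothetical stand-ins (a plain assertion-checked flag replaces `iommu->lock`, a counter replaces the pinned-page rbtree); this is an illustration, not the kernel code itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for iommu->lock: a flag whose acquire/release asserts
 * catch the self-deadlock that motivates the unlock/relock dance. */
static int lock_held;
static int pinned_pages = 4;	/* stand-in for dma->pfn_list entries */

static void lock_acquire(void) { assert(!lock_held); lock_held = 1; }
static void lock_release(void) { assert(lock_held);  lock_held = 0; }

/* Like the vfio_unpin_pages() path: the callback takes the same lock,
 * so it must never be invoked while the caller still holds it. */
static void unpin_notifier(void)
{
	lock_acquire();
	pinned_pages = 0;
	lock_release();
}

/* The pattern from the patch: release the lock, fire the notifier,
 * re-acquire, then re-check state that may have changed meanwhile.
 * Returns true once nothing is pinned. */
static bool do_unmap(void)
{
	bool empty;

	lock_acquire();
	if (pinned_pages > 0) {
		lock_release();		/* drop before the callback */
		unpin_notifier();	/* callback re-takes the lock */
		lock_acquire();		/* re-acquire and re-check */
	}
	empty = (pinned_pages == 0);
	lock_release();
	return empty;
}
```

Calling `unpin_notifier()` without first releasing the lock would trip the `assert(!lock_held)`, which is exactly the deadlock the comment in the patch is warning about.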
> I don't
> understand why the test for iommu->external_domain was there, clearly
> if the list is not empty, we need to notify.  Thanks,
>

Ok. The retry is good, it gives vendor drivers a chance to unpin everything.
But is it really required to use BUG_ON(), which would panic the host? I
think WARN_ON() should be fine; then when the container is closed, or when
the last group is removed from the container, vfio_iommu_type1_release() is
called and we get another chance to unpin it all.

Thanks,
Kirti

> Alex
>
>> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
>> index ee9a680..50cafdf 100644
>> --- a/drivers/vfio/vfio_iommu_type1.c
>> +++ b/drivers/vfio/vfio_iommu_type1.c
>> @@ -782,9 +782,9 @@ static int vfio_dma_do_unmap(struct vfio_iommu *iommu,
>>  			     struct vfio_iommu_type1_dma_unmap *unmap)
>>  {
>>  	uint64_t mask;
>> -	struct vfio_dma *dma;
>> +	struct vfio_dma *dma, *dma_last = NULL;
>>  	size_t unmapped = 0;
>> -	int ret = 0;
>> +	int ret = 0, retries;
>>
>>  	mask = ((uint64_t)1 << __ffs(vfio_pgsize_bitmap(iommu))) - 1;
>>
>> @@ -794,7 +794,7 @@ static int vfio_dma_do_unmap(struct vfio_iommu *iommu,
>>  		return -EINVAL;
>>
>>  	WARN_ON(mask & PAGE_MASK);
>> -
>> +again:
>>  	mutex_lock(&iommu->lock);
>>
>>  	/*
>> @@ -851,11 +851,16 @@ static int vfio_dma_do_unmap(struct vfio_iommu *iommu,
>>  		if (dma->task->mm != current->mm)
>>  			break;
>>
>> -		unmapped += dma->size;
>> -
>> -		if (iommu->external_domain && !RB_EMPTY_ROOT(&dma->pfn_list)) {
>> +		if (!RB_EMPTY_ROOT(&dma->pfn_list)) {
>>  			struct vfio_iommu_type1_dma_unmap nb_unmap;
>>
>> +			if (dma_last == dma) {
>> +				BUG_ON(++retries > 10);
>> +			} else {
>> +				dma_last = dma;
>> +				retries = 0;
>> +			}
>> +
>>  			nb_unmap.iova = dma->iova;
>>  			nb_unmap.size = dma->size;
>>
>> @@ -868,11 +873,11 @@ static int vfio_dma_do_unmap(struct vfio_iommu *iommu,
>>  			blocking_notifier_call_chain(&iommu->notifier,
>>  						    VFIO_IOMMU_NOTIFY_DMA_UNMAP,
>>  						    &nb_unmap);
>> -			mutex_lock(&iommu->lock);
>> -			if (WARN_ON(!RB_EMPTY_ROOT(&dma->pfn_list)))
>> -				break;
>> +			goto again;
>>  		}
>> +		unmapped += dma->size;
>>  		vfio_remove_dma(iommu, dma);
>> +
>>  	}
>>
>>  unlock:
>
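The bounded-retry scheme in the proposal above (restart the scan after every notify, but give up if the same entry stays pinned after 10 attempts) can be sketched as a small userspace analogue. This simplifies to a single entry with a pin counter, so the `dma_last`/`retries` bookkeeping collapses to one counter; the names and the "give up" return value are illustrative only, standing in for the WARN_ON/BUG_ON decision being discussed.

```c
#include <stdbool.h>

/* Stand-in for the pinned state of one vfio_dma entry; each notify
 * call is assumed to unpin one page, mimicking a cooperating vendor
 * driver that makes forward progress on every callback. */
static int pins_left;

static void notify_unmap(void)
{
	if (pins_left > 0)
		pins_left--;
}

/* Mirror of the retry logic in the proposed diff: after each notify,
 * jump back and rescan; if the entry is still pinned after 10 tries,
 * stop retrying (the WARN_ON vs BUG_ON question is what to do here).
 * Returns true when the entry was fully unpinned. */
static bool unmap_with_retries(void)
{
	int retries = 0;

again:
	if (pins_left > 0) {
		if (++retries > 10)
			return false;	/* non-participating driver */
		notify_unmap();
		goto again;
	}
	return true;
}
```

With a cooperating callback the loop converges in a few iterations; a callback that never fully unpins hits the retry bound instead of spinning forever, which is the gap-filling property Alex points out.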