Subject: Re: [Qemu-devel] [PATCH v9 04/12] vfio iommu: Add support for mediated devices
From: Alexey Kardashevskiy
To: Kirti Wankhede, alex.williamson@redhat.com, pbonzini@redhat.com, kraxel@redhat.com, cjia@nvidia.com
Cc: jike.song@intel.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, kevin.tian@intel.com, qemu-devel@nongnu.org, bjsdjshi@linux.vnet.ibm.com
Date: Wed, 2 Nov 2016 15:09:56 +1100
Message-ID: <695ca09f-b332-d33a-22fb-073f03dfaebf@ozlabs.ru>
In-Reply-To: <45b517de-3766-e96b-fec0-2b77da4dcf8d@nvidia.com>

On 02/11/16 14:29, Kirti Wankhede wrote:
>
> On 11/2/2016 6:54 AM, Alexey Kardashevskiy wrote:
>> On 02/11/16 01:01, Kirti Wankhede wrote:
>>>
>>> On 10/28/2016 7:48 AM, Alexey Kardashevskiy wrote:
>>>> On 27/10/16 23:31, Kirti Wankhede wrote:
>>>>>
>>>>> On 10/27/2016 12:50 PM, Alexey Kardashevskiy wrote:
>>>>>> On 18/10/16 08:22, Kirti Wankhede wrote:
>>>>>>> VFIO IOMMU drivers are designed for the devices which are IOMMU capable.
>>>>>>> Mediated device only uses IOMMU APIs, the underlying hardware can be
>>>>>>> managed by an IOMMU domain.
>>>>>>>
>>>>>>> Aim of this change is:
>>>>>>> - To use most of the code of TYPE1 IOMMU driver for mediated devices
>>>>>>> - To support direct assigned device and mediated device in single module
>>>>>>>
>>>>>>> Added two new callback functions to struct vfio_iommu_driver_ops. Backend
>>>>>>> IOMMU module that supports pinning and unpinning pages for mdev devices
>>>>>>> should provide these functions.
>>>>>>> Added APIs for pinning and unpinning pages to VFIO module. These call back
>>>>>>> into backend iommu module to actually pin and unpin pages.
>>>>>>>
>>>>>>> This change adds pin and unpin support for mediated device to TYPE1 IOMMU
>>>>>>> backend module. More details:
>>>>>>> - When iommu_group of mediated devices is attached, task structure is
>>>>>>>   cached which is used later to pin pages and page accounting.
>>>>>>
>>>>>>
>>>>>> For SPAPR TCE IOMMU driver, I ended up caching mm_struct with
>>>>>> atomic_inc(&container->mm->mm_count) (patches are on the way) instead of
>>>>>> using @current or task as the process might be gone while VFIO container is
>>>>>> still alive and @mm might be needed to do proper cleanup; this might not be
>>>>>> an issue with this patchset now but still you seem to only use @mm from
>>>>>> task_struct.
>>>>>>
>>>>>
>>>>> Consider the example of QEMU process which creates VFIO container, QEMU
>>>>> in its teardown path would release the container. How could container be
>>>>> alive when process is gone?
>>>>
>>>> do_exit() in kernel/exit.c calls exit_mm() (which sets NULL to tsk->mm)
>>>> first, and then releases open files by calling exit_files(). So
>>>> container's release() does not have current->mm.
>>>>
>>>
>>> Incrementing usage count (get_task_struct()) while saving task structure
>>> and decrementing it (put_task_struct()) from release() should work here.
>>> Updating the patch.
>>
>> I cannot see how the task->usage counter prevents do_exit() from performing
>> the exit, can you?
>>
>
> It will not prevent exit from do_exit(), but that will make sure that we
> don't have stale pointer of task structure. Then we can check whether
> the task is alive and get mm pointer in teardown path as below:

Or you could just reference and use @mm as KVM and others do. Or is there
anything else you need from @current other than @mm?

>
> {
> 	struct task_struct *task = domain->external_addr_space->task;
> 	struct mm_struct *mm = NULL;
>
> 	put_pfn(pfn, prot);
>
> 	if (pid_alive(task))
> 		mm = get_task_mm(task);
>
> 	if (mm) {
> 		if (do_accounting)
> 			vfio_lock_acct(task, -1);
>
> 		mmput(mm);
> 	}
> }

--
Alexey
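
The ordering argument in this thread (exit_mm() runs before exit_files(), so
the container's release() can no longer reach the mm through the task) can be
illustrated with a small userspace model. This is only a sketch under stated
assumptions: it is not kernel code, every model_* name is hypothetical, and a
plain integer stands in for the mm_count reference mentioned above. It shows
only why a reference taken on the mm at attach time still allows the
accounting to be undone after the task has dropped its own mm pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model only; these are NOT the kernel's structures. */
struct model_mm {
	int refcount;       /* stands in for mm->mm_count */
	long locked_pages;  /* stands in for the locked-page accounting */
	int freed;          /* set when the last reference is dropped */
};

struct model_task {
	struct model_mm *mm;  /* cleared on exit, like tsk->mm */
};

struct model_container {
	struct model_mm *mm;  /* reference cached at attach time */
	long pinned;          /* pages this container has accounted */
};

static void model_mm_get(struct model_mm *mm)
{
	mm->refcount++;
}

static void model_mm_put(struct model_mm *mm)
{
	if (--mm->refcount == 0)
		mm->freed = 1;
}

/* Attach: cache the mm itself (not the task) and take a reference. */
static void model_attach(struct model_container *c, struct model_task *t)
{
	c->mm = t->mm;
	model_mm_get(c->mm);
	c->pinned = 0;
}

/* Pin: account pages against the cached mm. */
static void model_pin(struct model_container *c, long npages)
{
	c->pinned += npages;
	c->mm->locked_pages += npages;
}

/* First exit step in this model: the task drops its mm pointer. */
static void model_exit_mm(struct model_task *t)
{
	struct model_mm *mm = t->mm;

	t->mm = NULL;
	model_mm_put(mm);
}

/*
 * Second exit step in this model: open files are released, so the
 * container's release runs. It uses the cached mm, never the task.
 * Returns the number of pages whose accounting was undone.
 */
static long model_release(struct model_container *c)
{
	long undone = c->pinned;

	c->mm->locked_pages -= undone;
	c->pinned = 0;
	model_mm_put(c->mm);
	c->mm = NULL;
	return undone;
}
```

To exercise the model, call model_attach() and model_pin(), then
model_exit_mm() followed by model_release(), in that order. The task's mm
pointer is already NULL by the time model_release() runs, yet the accounting
is still reverted through the container's own reference, and the mm is marked
freed only after that last reference is dropped.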