Date: Tue, 8 Nov 2016 12:14:23 -0700
From: Alex Williamson
To: Kirti Wankhede
Subject: Re: [PATCH v11 05/22] vfio iommu: Added pin and unpin callback
 functions to vfio_iommu_driver_ops
Message-ID: <20161108121423.352d4505@t450s.home>
In-Reply-To: <67cd108d-0792-d2d4-29d9-7560e472e7ce@nvidia.com>
References: <1478293856-8191-1-git-send-email-kwankhede@nvidia.com>
 <1478293856-8191-6-git-send-email-kwankhede@nvidia.com>
 <20161107123649.0f3d56e9@t450s.home>
 <20161108093952.58ced4d7@t450s.home>
 <67cd108d-0792-d2d4-29d9-7560e472e7ce@nvidia.com>

On Wed, 9 Nov 2016 00:17:53 +0530
Kirti Wankhede wrote:

> On 11/8/2016 10:09 PM, Alex Williamson wrote:
> > On Tue, 8 Nov 2016 19:25:35 +0530
> > Kirti Wankhede wrote:
> >
> ...
>
> >>>> -
> >>>> +	int	(*pin_pages)(void *iommu_data, unsigned long *user_pfn,
> >>>> +			     int npage, int prot,
> >>>> +			     unsigned long *phys_pfn);
> >>>> +	int	(*unpin_pages)(void *iommu_data,
> >>>
> >>> Are we changing from long to int here simply because of the absurdity
> >>> in passing in more than a 2^31 entry array, that would already consume
> >>> more than 16GB itself?
> >>>
> >>
> >> These are on-demand pin/unpin requests; will such a request go beyond the
> >> 16GB limit? For the Nvidia vGPU solution, pin requests will not go beyond
> >> this limit.
> >
> > 16GB is simply the size of the user_pfn or phys_pfn arrays at a maximal
> > int32_t npage value; the interface actually allows mapping up to 8TB
> > per call, but at that point we have 16GB of input, 16GB of output, and
> > 80GB of vfio_pfns created.  So I don't really have a problem changing
> > from long to int given the lack of scalability in the API in general,
> > but it does make me second guess the API itself.  Thanks,
> >
>
> Changing to 'long', in future we might enhance this API without changing
> its signature.

I think the pfn arrays are more of a problem long term than whether we
can only map 2^31 pfns in one call.  I particularly dislike that the
caller provides both the iova and pfn arrays for unpinning.  Being an
in-kernel driver, we should trust it, but it makes the interface
difficult to use and seems to indicate that our tracking data
structures aren't architected the way they should be.  Upstream, this
API will need to be flexible and change over time; it's only the
downstream distros that may lock in a kABI.  Not breaking them should
be a consideration, but it also needs to be weighed against long-term
upstream goals.  Thanks,

Alex
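
As a back-of-the-envelope check of the numbers quoted above (16GB per pfn
array, 8TB mapped per call, ~80GB of vfio_pfns), the short standalone sketch
below works through the arithmetic.  The 4KiB page size and the ~40-byte
per-page tracking cost are illustrative assumptions, not values taken from
the patch itself.

    /*
     * Worst-case sizes for a single pin_pages/unpin_pages call at the
     * npage limit discussed in the thread.  Assumes 8-byte pfn array
     * entries, 4KiB pages, and ~40 bytes of tracking state per pinned
     * page (the last value is an assumption, not the real
     * sizeof(struct vfio_pfn)).
     */
    #include <stdio.h>

    int main(void)
    {
            unsigned long long npage = 1ULL << 31;  /* ~maximal int32_t npage */
            unsigned long long entry = 8;           /* sizeof(unsigned long), 64-bit */
            unsigned long long page_size = 4096;    /* assumed 4KiB pages */
            unsigned long long track = 40;          /* assumed per-page tracking cost */

            printf("user_pfn array:   %llu GiB\n", npage * entry >> 30);     /* 16 */
            printf("phys_pfn array:   %llu GiB\n", npage * entry >> 30);     /* 16 */
            printf("memory described: %llu TiB\n", npage * page_size >> 40); /* 8 */
            printf("tracking structs: %llu GiB\n", npage * track >> 30);     /* 80 */
            return 0;
    }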