Received: by 2002:a25:86ce:0:0:0:0:0 with SMTP id y14csp1921298ybm; Thu, 23 May 2019 08:38:56 -0700 (PDT) X-Google-Smtp-Source: APXvYqxjfSOd8qh/O15ZijoCfdM0mnzad+tlfJefQflofozboVCUV4iLk+RMN0Go2Hw5YGmYHHQt X-Received: by 2002:a17:90a:cf87:: with SMTP id i7mr2059328pju.72.1558625935918; Thu, 23 May 2019 08:38:55 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1558625935; cv=none; d=google.com; s=arc-20160816; b=VB+paDWeJFfFkbsNiAuvIYXA38Sw9Cvr/Lm5MKYQV4eImKHgBzQ8FmBdWKKhDne9r7 tPDpumd62Evp8RYqND1HebHRuIZoaKVm6L7lcZZLms7ePE30yiHLU/B6Seo/kGnitSIF uU+hyoQGJA8J8Air51TXjlU/o4W7lp35tWSjf3Q03SWMVQIQBCgf4iAaOXulHjQxSkYl IMxHU9i3LnC4wyb7FFlJXRSAaF0YjK10JrDk0Fx/j+Sm4g+DHwK/REWzM+q5DEzA9Tz7 xdvD5BNgo1ZNteiyJgv9AYhg+3DMb8K61juBv0ApDnCHWRGn4D0y7Ek1bIDlFdO9bEwG BR9w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:mime-version :organization:references:in-reply-to:message-id:subject:cc:to:from :date; bh=6Jb4nCs2jSlJKsZZR+0ycXLbnq18cD5eV0UlBaMDXUw=; b=ajD8DcZ+o1g5zmEXjLBI7rWjJqFR+8dWFdn4QxfxPlQci17tK6eZCvjRHWrvM6J7t7 tb8iJ2BCpdNuudmAvpgz87/P2KQSxffl4vPH/0dPFvyF60MkQ3xfAQUHEv13HkgKicoD cBnMMx6GmUWbjtNHOA3OiOghFwE/rOKJWqwxqRnON6wW+CSGmh7cktNsH1mi98VeFVhm +z4KziaqUXmw4CMitY/zp73ZGGZugtZRPLxl60+V7d6kbly/bsTih8UklyMAG3j4nfiW 5eksvozvS5kKeuQa3dJzubhSmEzvKx4AbIKhjsW6s0yHHrx73/IVaEsFb+KrRX4RpuMu AX1Q== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id f15si31409146pgu.218.2019.05.23.08.38.38; Thu, 23 May 2019 08:38:55 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731119AbfEWPhW (ORCPT + 99 others); Thu, 23 May 2019 11:37:22 -0400 Received: from mga03.intel.com ([134.134.136.65]:65249 "EHLO mga03.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730859AbfEWPhW (ORCPT ); Thu, 23 May 2019 11:37:22 -0400 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga003.jf.intel.com ([10.7.209.27]) by orsmga103.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 23 May 2019 08:37:21 -0700 X-ExtLoop1: 1 Received: from jacob-builder.jf.intel.com (HELO jacob-builder) ([10.7.199.155]) by orsmga003.jf.intel.com with ESMTP; 23 May 2019 08:37:21 -0700 Date: Thu, 23 May 2019 08:40:19 -0700 From: Jacob Pan To: Auger Eric Cc: iommu@lists.linux-foundation.org, LKML , Joerg Roedel , David Woodhouse , Alex Williamson , Jean-Philippe Brucker , Yi Liu , "Tian, Kevin" , Raj Ashok , Christoph Hellwig , Lu Baolu , Andriy Shevchenko , jacob.jun.pan@linux.intel.com Subject: Re: [PATCH v3 04/16] ioasid: Add custom IOASID allocator Message-ID: <20190523084019.7f940aa5@jacob-builder> In-Reply-To: References: <1556922737-76313-1-git-send-email-jacob.jun.pan@linux.intel.com> <1556922737-76313-5-git-send-email-jacob.jun.pan@linux.intel.com> <20190522124255.117de6f6@jacob-builder> Organization: OTC X-Mailer: Claws Mail 3.13.2 (GTK+ 2.24.30; x86_64-pc-linux-gnu) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, 23 May 2019 09:14:07 +0200 Auger Eric wrote: > Hi Jacob, > > On 5/22/19 9:42 PM, Jacob Pan wrote: > > On Tue, 21 May 2019 11:55:55 +0200 > > Auger Eric wrote: > > > >> Hi Jacob, > >> > >> On 5/4/19 12:32 AM, Jacob Pan wrote: > >>> Sometimes, IOASID allocation must be handled by platform specific > >>> code. The use cases are guest vIOMMU and pvIOMMU where IOASIDs > >>> need to be allocated by the host via enlightened or paravirt > >>> interfaces. > >>> > >>> This patch adds an extension to the IOASID allocator APIs such > >>> that platform drivers can register a custom allocator, possibly > >>> at boot time, to take over the allocation. Xarray is still used > >>> for tracking and searching purposes internal to the IOASID code. > >>> Private data of an IOASID can also be set after the allocation. > >>> > >>> There can be multiple custom allocators registered but only one is > >>> used at a time. In case of hot removal of devices that provides > >>> the allocator, all IOASIDs must be freed prior to unregistering > >>> the allocator. Default XArray based allocator cannot be mixed with > >>> custom allocators, i.e. custom allocators will not be used if > >>> there are outstanding IOASIDs allocated by the default XA > >>> allocator. > >>> > >>> Signed-off-by: Jacob Pan > >>> --- > >>> drivers/iommu/ioasid.c | 125 > >>> +++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, > >>> 125 insertions(+) > >>> > >>> diff --git a/drivers/iommu/ioasid.c b/drivers/iommu/ioasid.c > >>> index 99f5e0a..ed2915a 100644 > >>> --- a/drivers/iommu/ioasid.c > >>> +++ b/drivers/iommu/ioasid.c > >>> @@ -17,6 +17,100 @@ struct ioasid_data { > >>> }; > >>> > >>> static DEFINE_XARRAY_ALLOC(ioasid_xa); > >>> +static DEFINE_MUTEX(ioasid_allocator_lock); > >>> +static struct ioasid_allocator *active_custom_allocator; > >>> + > >>> +static LIST_HEAD(custom_allocators); > >>> +/* > >>> + * A flag to track if ioasid default allocator is in use, this > >>> will > >>> + * prevent custom allocator from being used. The reason is that > >>> custom allocator > >>> + * must have unadulterated space to track private data with > >>> xarray, there cannot > >>> + * be a mix been default and custom allocated IOASIDs. > >>> + */ > >>> +static int default_allocator_active; > >>> + > >>> +/** > >>> + * ioasid_register_allocator - register a custom allocator > >>> + * @allocator: the custom allocator to be registered > >>> + * > >>> + * Custom allocators take precedence over the default xarray > >>> based allocator. > >>> + * Private data associated with the ASID are managed by ASID > >>> common code > >>> + * similar to data stored in xa. > >>> + * > >>> + * There can be multiple allocators registered but only one is > >>> active. In case > >>> + * of runtime removal of a custom allocator, the next one is > >>> activated based > >>> + * on the registration ordering. > >>> + */ > >>> +int ioasid_register_allocator(struct ioasid_allocator *allocator) > >>> +{ > >>> + struct ioasid_allocator *pallocator; > >>> + int ret = 0; > >>> + > >>> + if (!allocator) > >>> + return -EINVAL; > >> is it really necessary? Sin't it the caller responsibility? > > makes sense. will remove this one and below. > >>> + > >>> + mutex_lock(&ioasid_allocator_lock); > >>> + /* > >>> + * No particular preference since all custom allocators > >>> end up calling > >>> + * the host to allocate IOASIDs. We activate the first > >>> one and keep > >>> + * the later registered allocators in a list in case the > >>> first one gets > >>> + * removed due to hotplug. > >>> + */ > >>> + if (list_empty(&custom_allocators)) > >>> + active_custom_allocator = allocator;> + > >>> else { > >>> + /* Check if the allocator is already registered > >>> */ > >>> + list_for_each_entry(pallocator, > >>> &custom_allocators, list) { > >>> + if (pallocator == allocator) { > >>> + pr_err("IOASID allocator already > >>> registered\n"); > >>> + ret = -EEXIST; > >>> + goto out_unlock; > >>> + } > >>> + } > >>> + } > >>> + list_add_tail(&allocator->list, &custom_allocators); > >>> + > >>> +out_unlock: > >>> + mutex_unlock(&ioasid_allocator_lock); > >>> + return ret; > >>> +} > >>> +EXPORT_SYMBOL_GPL(ioasid_register_allocator); > >>> + > >>> +/** > >>> + * ioasid_unregister_allocator - Remove a custom IOASID allocator > >>> + * @allocator: the custom allocator to be removed > >>> + * > >>> + * Remove an allocator from the list, activate the next allocator > >>> in > >>> + * the order it was registered. > >>> + */ > >>> +void ioasid_unregister_allocator(struct ioasid_allocator > >>> *allocator) +{ > >>> + if (!allocator) > >>> + return; > >> is it really necessary? > >>> + > >>> + if (list_empty(&custom_allocators)) { > >>> + pr_warn("No custom IOASID allocators active!\n"); > >>> + return; > >>> + } > >>> + > >>> + mutex_lock(&ioasid_allocator_lock); > >>> + list_del(&allocator->list); > >>> + if (list_empty(&custom_allocators)) { > >>> + pr_info("No custom IOASID allocators\n")> > >>> + /* > >>> + * All IOASIDs should have been freed before the > >>> last custom > >>> + * allocator is unregistered. Unless default > >>> allocator is in > >>> + * use. > >>> + */ > >>> + BUG_ON(!xa_empty(&ioasid_xa) > >>> && !default_allocator_active); > >>> + active_custom_allocator = NULL; > >>> + } else if (allocator == active_custom_allocator) { > >> In case you are removing the active custom allocator don't you also > >> need to check that all ioasids were freed. Otherwise you are likely > >> to switch to a different allocator whereas the asid space is > >> partially populated. > > The assumption is that all custom allocators on the same guest will > > end up calling the same host allocator. Having multiple custom > > allocators in the list is just a way to support multiple (p)vIOMMUs > > with hotplug. Therefore, we cannot nor need to free all PASIDs when > > one custom allocator goes away. This is a different situation then > > switching between default allocator and custom allocator, where > > custom allocator has to start with a clean space. > Although I understand your specific usecase, this framework may have > other users, where custom allocators behave differently. > > Also the commit msg says:" > In case of hot removal of devices that provides the > allocator, all IOASIDs must be freed prior to unregistering the > allocator." > Right, it is inconsistent. Consider the following scenario on a single guest with two vIOMMUs: 1. vIOMMU1 register allocator A1 first 2. vIOMMU2 register allocator A2 stored to the allocator list 3. device belong to vIOMMU1 bind_sva(), allocate PASID1 from A1 4. device belong to vIOMMU2 bind_sva(), allocate PASID2 from A1 5. vIOMMU1 hot removed, free PASID1, then unregister A1 6. IOASID framework will try to free A1 then install A2 as the active allocator but PASID2 is in use. It will be unnecessarily disruptive to free PASID2 I can think of a solution: - Add a flag when registering ioasid custom allocator, IOASID_ALLOC_RETAIN, which means when switching to another custom allocator, all outstanding PASIDs will be retained. Of course it does not include switching to default allocator which does not have this RETAIN flag. - For the allocators do not have this flag, their PASIDs must be freed upon unregistering. Any thoughts? Jacob > Thanks > > Eric > > > > > >>> + active_custom_allocator = > >>> list_entry(&custom_allocators, struct ioasid_allocator, list); > >>> + pr_info("IOASID allocator changed"); > >>> + } > >>> + mutex_unlock(&ioasid_allocator_lock); > >>> +} > >>> +EXPORT_SYMBOL_GPL(ioasid_unregister_allocator); > >>> > >>> /** > >>> * ioasid_set_data - Set private data for an allocated ioasid > >>> @@ -68,6 +162,29 @@ ioasid_t ioasid_alloc(struct ioasid_set *set, > >>> ioasid_t min, ioasid_t max, data->set = set; > >>> data->private = private; > >>> > >>> + mutex_lock(&ioasid_allocator_lock); > >>> + /* > >>> + * Use custom allocator if available, otherwise use > >>> default. > >>> + * However, if there are active IOASIDs already been > >>> allocated by default > >>> + * allocator, custom allocator cannot be used. > >>> + */ > >>> + if (!default_allocator_active && > >>> active_custom_allocator) { > >>> + id = active_custom_allocator->alloc(min, max, > >>> active_custom_allocator->pdata); > >>> + if (id == INVALID_IOASID) { > >>> + pr_err("Failed ASID allocation by custom > >>> allocator\n"); > >>> + mutex_unlock(&ioasid_allocator_lock); > >>> + goto exit_free; > >>> + } > >>> + /* > >>> + * Use XA to manage private data also sanitiy > >>> check custom > >>> + * allocator for duplicates. > >>> + */ > >>> + min = id; > >>> + max = id + 1; > >>> + } else > >>> + default_allocator_active = 1; > >> nit: true? > > yes, i can turn default_allocator_active into a bool type. > > > >>> + mutex_unlock(&ioasid_allocator_lock); > >>> + > >>> if (xa_alloc(&ioasid_xa, &id, data, XA_LIMIT(min, max), > >>> GFP_KERNEL)) { pr_err("Failed to alloc ioasid from %d to %d\n", > >>> min, max); goto exit_free;> @@ -91,9 +208,17 @@ void > >>> ioasid_free(ioasid_t ioasid) { > >>> struct ioasid_data *ioasid_data; > >>> > >>> + mutex_lock(&ioasid_allocator_lock); > >>> + if (active_custom_allocator) > >>> + active_custom_allocator->free(ioasid, > >>> active_custom_allocator->pdata); > >>> + mutex_unlock(&ioasid_allocator_lock); > >>> + > >>> ioasid_data = xa_erase(&ioasid_xa, ioasid); > >>> > >>> kfree_rcu(ioasid_data, rcu); > >>> + > >>> + if (xa_empty(&ioasid_xa)) > >>> + default_allocator_active = 0; > >> Isn't it racy? what if an xa_alloc occurs inbetween? > >> > >> > > yes, i will move it under the mutex. Thanks. > >>> } > >>> EXPORT_SYMBOL_GPL(ioasid_free); > >>> > >>> > >> > >> Thanks > >> > >> Eric > > > > [Jacob Pan] > > [Jacob Pan]