Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752439AbbDBCeG (ORCPT ); Wed, 1 Apr 2015 22:34:06 -0400 Received: from mail-pd0-f171.google.com ([209.85.192.171]:34220 "EHLO mail-pd0-f171.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752364AbbDBCeD (ORCPT ); Wed, 1 Apr 2015 22:34:03 -0400 Message-ID: <551CAA95.8080903@ozlabs.ru> Date: Thu, 02 Apr 2015 13:33:57 +1100 From: Alexey Kardashevskiy User-Agent: Mozilla/5.0 (X11; Linux i686 on x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.5.0 MIME-Version: 1.0 To: Alex Williamson CC: linuxppc-dev@lists.ozlabs.org, Benjamin Herrenschmidt , Paul Mackerras , linux-kernel@vger.kernel.org Subject: Re: [PATCH kernel v7 12/31] powerpc/spapr: vfio: Switch from iommu_table to new iommu_table_group References: <1427468115-2224-1-git-send-email-aik@ozlabs.ru> <1427468115-2224-13-git-send-email-aik@ozlabs.ru> <1427924911.5567.304.camel@redhat.com> In-Reply-To: <1427924911.5567.304.camel@redhat.com> Content-Type: text/plain; charset=koi8-r; format=flowed Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 5239 Lines: 120 On 04/02/2015 08:48 AM, Alex Williamson wrote: > On Sat, 2015-03-28 at 01:54 +1100, Alexey Kardashevskiy wrote: >> Modern IBM POWERPC systems support multiple (currently two) TCE tables >> per IOMMU group (a.k.a. PE). This adds a iommu_table_group container >> for TCE tables. Right now just one table is supported. >> >> Signed-off-by: Alexey Kardashevskiy >> --- >> Documentation/vfio.txt | 23 ++++++ >> arch/powerpc/include/asm/iommu.h | 18 +++-- >> arch/powerpc/kernel/iommu.c | 34 ++++---- >> arch/powerpc/platforms/powernv/pci-ioda.c | 38 +++++---- >> arch/powerpc/platforms/powernv/pci-p5ioc2.c | 17 ++-- >> arch/powerpc/platforms/powernv/pci.c | 2 +- >> arch/powerpc/platforms/powernv/pci.h | 4 +- >> arch/powerpc/platforms/pseries/iommu.c | 9 ++- >> drivers/vfio/vfio_iommu_spapr_tce.c | 120 ++++++++++++++++++++-------- >> 9 files changed, 183 insertions(+), 82 deletions(-) >> >> diff --git a/Documentation/vfio.txt b/Documentation/vfio.txt >> index 96978ec..94328c8 100644 >> --- a/Documentation/vfio.txt >> +++ b/Documentation/vfio.txt >> @@ -427,6 +427,29 @@ The code flow from the example above should be slightly changed: >> >> .... >> >> +5) There is v2 of SPAPR TCE IOMMU. It deprecates VFIO_IOMMU_ENABLE/ >> +VFIO_IOMMU_DISABLE and implements 2 new ioctls: >> +VFIO_IOMMU_SPAPR_REGISTER_MEMORY and VFIO_IOMMU_SPAPR_UNREGISTER_MEMORY >> +(which are unsupported in v1 IOMMU). >> + >> +PPC64 paravirtualized guests generate a lot of map/unmap requests, >> +and the handling of those includes pinning/unpinning pages and updating >> +mm::locked_vm counter to make sure we do not exceed the rlimit. >> +The v2 IOMMU splits accounting and pinning into separate operations: >> + >> +- VFIO_IOMMU_SPAPR_REGISTER_MEMORY/VFIO_IOMMU_SPAPR_UNREGISTER_MEMORY ioctls >> +receive a user space address and size of the block to be pinned. >> +Bisecting is not supported and VFIO_IOMMU_UNREGISTER_MEMORY is expected to >> +be called with the exact address and size used for registering >> +the memory block. The userspace is not expected to call these often. >> +The ranges are stored in a linked list in a VFIO container. >> + >> +- VFIO_IOMMU_MAP_DMA/VFIO_IOMMU_UNMAP_DMA ioctls only update the actual >> +IOMMU table and do not do pinning; instead these check that the userspace >> +address is from pre-registered range. >> + >> +This separation helps in optimizing DMA for guests. >> + >> ------------------------------------------------------------------------------- >> >> [1] VFIO was originally an acronym for "Virtual Function I/O" in its >> diff --git a/arch/powerpc/include/asm/iommu.h b/arch/powerpc/include/asm/iommu.h >> index eb75726..667aa1a 100644 >> --- a/arch/powerpc/include/asm/iommu.h >> +++ b/arch/powerpc/include/asm/iommu.h >> @@ -90,9 +90,7 @@ struct iommu_table { >> struct iommu_pool pools[IOMMU_NR_POOLS]; >> unsigned long *it_map; /* A simple allocation bitmap for now */ >> unsigned long it_page_shift;/* table iommu page size */ >> -#ifdef CONFIG_IOMMU_API >> - struct iommu_group *it_group; >> -#endif >> + struct iommu_table_group *it_group; >> struct iommu_table_ops *it_ops; >> void (*set_bypass)(struct iommu_table *tbl, bool enable); >> }; >> @@ -126,14 +124,24 @@ extern void iommu_free_table(struct iommu_table *tbl, const char *node_name); >> */ >> extern struct iommu_table *iommu_init_table(struct iommu_table * tbl, >> int nid); >> + >> +#define IOMMU_TABLE_GROUP_MAX_TABLES 1 >> + >> +struct iommu_table_group { >> #ifdef CONFIG_IOMMU_API >> -extern void iommu_register_group(struct iommu_table *tbl, >> + struct iommu_group *group; >> +#endif >> + struct iommu_table tables[IOMMU_TABLE_GROUP_MAX_TABLES]; >> +}; >> + >> +#ifdef CONFIG_IOMMU_API >> +extern void iommu_register_group(struct iommu_table_group *table_group, >> int pci_domain_number, unsigned long pe_num); >> extern int iommu_add_device(struct device *dev); >> extern void iommu_del_device(struct device *dev); >> extern int __init tce_iommu_bus_notifier_init(void); >> #else >> -static inline void iommu_register_group(struct iommu_table *tbl, >> +static inline void iommu_register_group(struct iommu_table_group *table_group, >> int pci_domain_number, >> unsigned long pe_num) > > > Not a new problem, but there's some awfully liberal use of the namespace > with function names here. IOMMU API uses iommu_foo() functions. IOMMU > group related interfaces within the IOMMU API include "group" somewhere > in that name. powerpc specific functions should include a tag to avoid > causing conflicts there. Cannot argue with that but it is kind of late or not for this patchset, no? And iommu_table is way too generic for powerpc/spapr-specific thing. I can replace with something better, should I do this now? > (sorry for commenting twice on the same patch) > -- Alexey -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/