Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752375AbaGUDDk (ORCPT ); Sun, 20 Jul 2014 23:03:40 -0400 Received: from mail-qg0-f50.google.com ([209.85.192.50]:45122 "EHLO mail-qg0-f50.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751673AbaGUDDi (ORCPT ); Sun, 20 Jul 2014 23:03:38 -0400 Date: Sun, 20 Jul 2014 23:03:31 -0400 From: Jerome Glisse To: Oded Gabbay Cc: David Airlie , Alex Deucher , Andrew Morton , John Bridgman , Joerg Roedel , Andrew Lewycky , Christian =?iso-8859-1?Q?K=F6nig?= , Michel =?iso-8859-1?Q?D=E4nzer?= , Ben Goz , Alexey Skidanov , Evgeny Pinchuk , "linux-kernel@vger.kernel.org" , "dri-devel@lists.freedesktop.org" , linux-mm Subject: Re: [PATCH v2 00/25] AMDKFD kernel driver Message-ID: <20140721030330.GN3068@gmail.com> References: <53C7D645.3070607@amd.com> <20140720174652.GE3068@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <20140720174652.GE3068@gmail.com> User-Agent: Mutt/1.5.23 (2014-03-12) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Sun, Jul 20, 2014 at 01:46:53PM -0400, Jerome Glisse wrote: > On Thu, Jul 17, 2014 at 04:57:25PM +0300, Oded Gabbay wrote: > > Forgot to cc mailing list on cover letter. Sorry. > > > > As a continuation to the existing discussion, here is a v2 patch series > > restructured with a cleaner history and no totally-different-early-versions > > of the code. > > > > Instead of 83 patches, there are now a total of 25 patches, where 5 of them > > are modifications to radeon driver and 18 of them include only amdkfd code. > > There is no code going away or even modified between patches, only added. > > > > The driver was renamed from radeon_kfd to amdkfd and moved to reside under > > drm/radeon/amdkfd. This move was done to emphasize the fact that this driver > > is an AMD-only driver at this point. Having said that, we do foresee a > > generic hsa framework being implemented in the future and in that case, we > > will adjust amdkfd to work within that framework. > > > > As the amdkfd driver should support multiple AMD gfx drivers, we want to > > keep it as a seperate driver from radeon. Therefore, the amdkfd code is > > contained in its own folder. The amdkfd folder was put under the radeon > > folder because the only AMD gfx driver in the Linux kernel at this point > > is the radeon driver. Having said that, we will probably need to move it > > (maybe to be directly under drm) after we integrate with additional AMD gfx > > drivers. > > > > For people who like to review using git, the v2 patch set is located at: > > http://cgit.freedesktop.org/~gabbayo/linux/log/?h=kfd-next-3.17-v2 > > > > Written by Oded Gabbayh > > So quick comments before i finish going over all patches. There is many > things that need more documentation espacialy as of right now there is > no userspace i can go look at. > > There few show stopper, biggest one is gpu memory pinning this is a big > no, that would need serious arguments for any hope of convincing me on > that side. > > It might be better to add a drivers/gpu/drm/amd directory and add common > stuff there. > > Given that this is not intended to be final HSA api AFAICT then i would > say this far better to avoid the whole kfd module and add ioctl to radeon. > This would avoid crazy communication btw radeon and kfd. > > The whole aperture business needs some serious explanation. Especialy as > you want to use userspace address there is nothing to prevent userspace > program from allocating things at address you reserve for lds, scratch, > ... only sane way would be to move those lds, scratch inside the virtual > address reserved for kernel (see kernel memory map). So i skimmed over the iommu v2 specification and while the iommu v2 claims to obey the user/supervisor flags of the cpu page table, it does not seems that this is a property set in the iommu against a pasid (ie is a given pasid is allow supervisor access or not). It seems that the supervisor is part of pcie tlp request which i assume is control by the gpu. So how is this bit set ? How can we make sure that there is no way to abuse it ? > > The whole business of locking performance counter for exclusive per process > access is a big NO. Which leads me to the questionable usefullness of user > space command ring. I only see issues with that. First and foremost i would > need to see solid figures that kernel ioctl or syscall has a higher an > overhead that is measurable in any meaning full way against a simple > function call. I know the userspace command ring is a big marketing features > that please ignorant userspace programmer. But really this only brings issues > and for absolutely not upside afaict. > > So i would rather see a very simple ioctl that write the doorbell and might > do more than that in case of ring/queue overcommit where it would first have > to wait for a free ring/queue to schedule stuff. This would also allow sane > implementation of things like performance counter that could be acquire by > kernel for duration of a job submitted by userspace. While still not optimal > this would be better that userspace locking. > > > I might have more thoughts once i am done with all the patches. > > Cheers, > J?r?me > > > > > Original Cover Letter: > > > > This patch set implements a Heterogeneous System Architecture (HSA) driver > > for radeon-family GPUs. > > HSA allows different processor types (CPUs, DSPs, GPUs, etc..) to share > > system resources more effectively via HW features including shared pageable > > memory, userspace-accessible work queues, and platform-level atomics. In > > addition to the memory protection mechanisms in GPUVM and IOMMUv2, the Sea > > Islands family of GPUs also performs HW-level validation of commands passed > > in through the queues (aka rings). > > > > The code in this patch set is intended to serve both as a sample driver for > > other HSA-compatible hardware devices and as a production driver for > > radeon-family processors. The code is architected to support multiple CPUs > > each with connected GPUs, although the current implementation focuses on a > > single Kaveri/Berlin APU, and works alongside the existing radeon kernel > > graphics driver (kgd). > > AMD GPUs designed for use with HSA (Sea Islands and up) share some hardware > > functionality between HSA compute and regular gfx/compute (memory, > > interrupts, registers), while other functionality has been added > > specifically for HSA compute (hw scheduler for virtualized compute rings). > > All shared hardware is owned by the radeon graphics driver, and an interface > > between kfd and kgd allows the kfd to make use of those shared resources, > > while HSA-specific functionality is managed directly by kfd by submitting > > packets into an HSA-specific command queue (the "HIQ"). > > > > During kfd module initialization a char device node (/dev/kfd) is created > > (surviving until module exit), with ioctls for queue creation & management, > > and data structures are initialized for managing HSA device topology. > > The rest of the initialization is driven by calls from the radeon kgd at the > > following points : > > > > - radeon_init (kfd_init) > > - radeon_exit (kfd_fini) > > - radeon_driver_load_kms (kfd_device_probe, kfd_device_init) > > - radeon_driver_unload_kms (kfd_device_fini) > > > > During the probe and init processing per-device data structures are > > established which connect to the associated graphics kernel driver. This > > information is exposed to userspace via sysfs, along with a version number > > allowing userspace to determine if a topology change has occurred while it > > was reading from sysfs. > > The interface between kfd and kgd also allows the kfd to request buffer > > management services from kgd, and allows kgd to route interrupt requests to > > kfd code since the interrupt block is shared between regular > > graphics/compute and HSA compute subsystems in the GPU. > > > > The kfd code works with an open source usermode library ("libhsakmt") which > > is in the final stages of IP review and should be published in a separate > > repo over the next few days. > > The code operates in one of three modes, selectable via the sched_policy > > module parameter : > > > > - sched_policy=0 uses a hardware scheduler running in the MEC block within > > CP, and allows oversubscription (more queues than HW slots) > > - sched_policy=1 also uses HW scheduling but does not allow > > oversubscription, so create_queue requests fail when we run out of HW slots > > - sched_policy=2 does not use HW scheduling, so the driver manually assigns > > queues to HW slots by programming registers > > > > The "no HW scheduling" option is for debug & new hardware bringup only, so > > has less test coverage than the other options. Default in the current code > > is "HW scheduling without oversubscription" since that is where we have the > > most test coverage but we expect to change the default to "HW scheduling > > with oversubscription" after further testing. This effectively removes the > > HW limit on the number of work queues available to applications. > > > > Programs running on the GPU are associated with an address space through the > > VMID field, which is translated to a unique PASID at access time via a set > > of 16 VMID-to-PASID mapping registers. The available VMIDs (currently 16) > > are partitioned (under control of the radeon kgd) between current > > gfx/compute and HSA compute, with each getting 8 in the current code. The > > VMID-to-PASID mapping registers are updated by the HW scheduler when used, > > and by driver code if HW scheduling is not being used. > > The Sea Islands compute queues use a new "doorbell" mechanism instead of the > > earlier kernel-managed write pointer registers. Doorbells use a separate BAR > > dedicated for this purpose, and pages within the doorbell aperture are > > mapped to userspace (each page mapped to only one user address space). > > Writes to the doorbell aperture are intercepted by GPU hardware, allowing > > userspace code to safely manage work queues (rings) without requiring a > > kernel call for every ring update. > > First step for an application process is to open the kfd device. Calls to > > open create a kfd "process" structure only for the first thread of the > > process. Subsequent open calls are checked to see if they are from processes > > using the same mm_struct and, if so, don't do anything. The kfd per-process > > data lives as long as the mm_struct exists. Each mm_struct is associated > > with a unique PASID, allowing the IOMMUv2 to make userspace process memory > > accessible to the GPU. > > Next step is for the application to collect topology information via sysfs. > > This gives userspace enough information to be able to identify specific > > nodes (processors) in subsequent queue management calls. Application > > processes can create queues on multiple processors, and processors support > > queues from multiple processes. > > At this point the application can create work queues in userspace memory and > > pass them through the usermode library to kfd to have them mapped onto HW > > queue slots so that commands written to the queues can be executed by the > > GPU. Queue operations specify a processor node, and so the bulk of this code > > is device-specific. > > Written by John Bridgman > > > > > > Alexey Skidanov (1): > > amdkfd: Implement the Get Process Aperture IOCTL > > > > Andrew Lewycky (3): > > amdkfd: Add basic modules to amdkfd > > amdkfd: Add interrupt handling module > > amdkfd: Implement the Set Memory Policy IOCTL > > > > Ben Goz (8): > > amdkfd: Add queue module > > amdkfd: Add mqd_manager module > > amdkfd: Add kernel queue module > > amdkfd: Add module parameter of scheduling policy > > amdkfd: Add packet manager module > > amdkfd: Add process queue manager module > > amdkfd: Add device queue manager module > > amdkfd: Implement the create/destroy/update queue IOCTLs > > > > Evgeny Pinchuk (3): > > amdkfd: Add topology module to amdkfd > > amdkfd: Implement the Get Clock Counters IOCTL > > amdkfd: Implement the PMC Acquire/Release IOCTLs > > > > Oded Gabbay (10): > > mm: Add kfd_process pointer to mm_struct > > drm/radeon: reduce number of free VMIDs and pipes in KV > > drm/radeon/cik: Don't touch int of pipes 1-7 > > drm/radeon: Report doorbell configuration to amdkfd > > drm/radeon: adding synchronization for GRBM GFX > > drm/radeon: Add radeon <--> amdkfd interface > > Update MAINTAINERS and CREDITS files with amdkfd info > > amdkfd: Add IOCTL set definitions of amdkfd > > amdkfd: Add amdkfd skeleton driver > > amdkfd: Add binding/unbinding calls to amd_iommu driver > > > > CREDITS | 7 + > > MAINTAINERS | 10 + > > drivers/gpu/drm/radeon/Kconfig | 2 + > > drivers/gpu/drm/radeon/Makefile | 3 + > > drivers/gpu/drm/radeon/amdkfd/Kconfig | 10 + > > drivers/gpu/drm/radeon/amdkfd/Makefile | 14 + > > drivers/gpu/drm/radeon/amdkfd/cik_mqds.h | 185 +++ > > drivers/gpu/drm/radeon/amdkfd/cik_regs.h | 220 ++++ > > drivers/gpu/drm/radeon/amdkfd/kfd_aperture.c | 123 ++ > > drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c | 518 +++++++++ > > drivers/gpu/drm/radeon/amdkfd/kfd_crat.h | 294 +++++ > > drivers/gpu/drm/radeon/amdkfd/kfd_device.c | 254 ++++ > > .../drm/radeon/amdkfd/kfd_device_queue_manager.c | 985 ++++++++++++++++ > > .../drm/radeon/amdkfd/kfd_device_queue_manager.h | 101 ++ > > drivers/gpu/drm/radeon/amdkfd/kfd_doorbell.c | 264 +++++ > > drivers/gpu/drm/radeon/amdkfd/kfd_interrupt.c | 161 +++ > > drivers/gpu/drm/radeon/amdkfd/kfd_kernel_queue.c | 305 +++++ > > drivers/gpu/drm/radeon/amdkfd/kfd_kernel_queue.h | 66 ++ > > drivers/gpu/drm/radeon/amdkfd/kfd_module.c | 131 +++ > > drivers/gpu/drm/radeon/amdkfd/kfd_mqd_manager.c | 291 +++++ > > drivers/gpu/drm/radeon/amdkfd/kfd_mqd_manager.h | 54 + > > drivers/gpu/drm/radeon/amdkfd/kfd_packet_manager.c | 488 ++++++++ > > drivers/gpu/drm/radeon/amdkfd/kfd_pasid.c | 97 ++ > > drivers/gpu/drm/radeon/amdkfd/kfd_pm4_headers.h | 682 +++++++++++ > > drivers/gpu/drm/radeon/amdkfd/kfd_pm4_opcodes.h | 107 ++ > > drivers/gpu/drm/radeon/amdkfd/kfd_priv.h | 466 ++++++++ > > drivers/gpu/drm/radeon/amdkfd/kfd_process.c | 405 +++++++ > > .../drm/radeon/amdkfd/kfd_process_queue_manager.c | 343 ++++++ > > drivers/gpu/drm/radeon/amdkfd/kfd_queue.c | 109 ++ > > drivers/gpu/drm/radeon/amdkfd/kfd_topology.c | 1207 ++++++++++++++++++++ > > drivers/gpu/drm/radeon/amdkfd/kfd_topology.h | 168 +++ > > drivers/gpu/drm/radeon/amdkfd/kfd_vidmem.c | 96 ++ > > drivers/gpu/drm/radeon/cik.c | 154 +-- > > drivers/gpu/drm/radeon/cik_reg.h | 65 ++ > > drivers/gpu/drm/radeon/cikd.h | 51 +- > > drivers/gpu/drm/radeon/radeon.h | 9 + > > drivers/gpu/drm/radeon/radeon_device.c | 32 + > > drivers/gpu/drm/radeon/radeon_drv.c | 5 + > > drivers/gpu/drm/radeon/radeon_kfd.c | 566 +++++++++ > > drivers/gpu/drm/radeon/radeon_kfd.h | 119 ++ > > drivers/gpu/drm/radeon/radeon_kms.c | 7 + > > include/linux/mm_types.h | 14 + > > include/uapi/linux/kfd_ioctl.h | 133 +++ > > 43 files changed, 9226 insertions(+), 95 deletions(-) > > create mode 100644 drivers/gpu/drm/radeon/amdkfd/Kconfig > > create mode 100644 drivers/gpu/drm/radeon/amdkfd/Makefile > > create mode 100644 drivers/gpu/drm/radeon/amdkfd/cik_mqds.h > > create mode 100644 drivers/gpu/drm/radeon/amdkfd/cik_regs.h > > create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_aperture.c > > create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_chardev.c > > create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_crat.h > > create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_device.c > > create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_device_queue_manager.c > > create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_device_queue_manager.h > > create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_doorbell.c > > create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_interrupt.c > > create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_kernel_queue.c > > create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_kernel_queue.h > > create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_module.c > > create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_mqd_manager.c > > create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_mqd_manager.h > > create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_packet_manager.c > > create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_pasid.c > > create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_pm4_headers.h > > create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_pm4_opcodes.h > > create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_priv.h > > create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_process.c > > create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_process_queue_manager.c > > create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_queue.c > > create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_topology.c > > create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_topology.h > > create mode 100644 drivers/gpu/drm/radeon/amdkfd/kfd_vidmem.c > > create mode 100644 drivers/gpu/drm/radeon/radeon_kfd.c > > create mode 100644 drivers/gpu/drm/radeon/radeon_kfd.h > > create mode 100644 include/uapi/linux/kfd_ioctl.h > > > > -- > > 1.9.1 > > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/