Date: Thu, 18 Jun 2020 17:00:51 +0200
From: Daniel Vetter
To: Jason Gunthorpe
Cc: Daniel Vetter, Thomas Hellström (Intel), DRI Development, linux-rdma, Intel Graphics Development, Maarten Lankhorst, LKML, amd-gfx list, "moderated list:DMA BUFFER SHARING FRAMEWORK", Thomas Hellstrom, Daniel Vetter, "open list:DMA BUFFER SHARING FRAMEWORK", Christian König, Mika Kuoppala
Subject: Re: [Linaro-mm-sig] [PATCH 04/18] dma-fence: prime lockdep annotations
Message-ID: <20200618150051.GS20149@phenom.ffwll.local>
References: <20200604081224.863494-1-daniel.vetter@ffwll.ch> <20200604081224.863494-5-daniel.vetter@ffwll.ch> <20200611083430.GD20149@phenom.ffwll.local> <20200611141515.GW6578@ziepe.ca> <20200616120719.GL20149@phenom.ffwll.local> <20200617152835.GF6578@ziepe.ca>
In-Reply-To: <20200617152835.GF6578@ziepe.ca>

On Wed, Jun 17, 2020 at 12:28:35PM -0300, Jason Gunthorpe wrote:
> On Wed, Jun 17, 2020 at 08:48:50AM +0200, Daniel Vetter wrote:
> > Now my understanding for rdma is that if you don't have hw page fault
> > support,
>
> The RDMA ODP feature is restartable HW page faulting just like nouveau
> has. The classical MR feature doesn't have this. Only mlx5 HW supports
> ODP today.
>
> > It's only gpus (I think) which are in this awkward in-between spot
> > where dynamic memory management really is much wanted, but the hw
> > kinda sucks. Aside, about 10+ years ago we had a similar problem with
> > gpu hw, but for security: many gpus didn't have any kind of page
> > tables to isolate different clients from each other. drivers/gpu
> > fixed this by parsing & validating what userspace submitted, to make
> > sure it's only ever accessing its own buffers. Most gpus have become
> > reasonable nowadays and do have proper per-process pagetables (gpu
> > process, not the pasid stuff), but even today there's still some of
> > the old model left in some of the smallest SoCs.
>
> But I still don't understand why a dma fence is needed inside the GPU
> driver itself in the notifier.
>
> Surely the GPU driver can block and release the notifier directly from
> its own command processing channel?
>
> Why does this fence and all it entails need to leak out across
> drivers?

So 10 years ago we had this world where every gpu driver was its own bucket and nothing leaked out to the rest of the world.
But the world had a different idea of how gpus were supposed to work, with stuff like:

- laptops with a power-efficient but slow gpu integrated on the cpu die, and a 2nd, much faster but also more wasteful gpu sitting separately

- multi-gpu rendering (though on linux we never really got around to enabling that, at least not for 3d rendering)

- socs just bundle IP blocks together, and very often they feel they have to do their own display block (it's fairly easy and lets you keep your hw engineers justified on payroll with some more patents they create), but anything fancier they buy in. So from a driver architecture pov even a single-chip soc looks like a bundle of gpus.

And you want to pipeline all this because performance, so waiting in userspace for one block to finish before you hand the buffer over to the next isn't a good idea. Hence dma_fence as a cross-driver leak was created by pulling the gpu completion tracking out of the drm/ttm library for managing vram.

Now with glorious hindsight we could have come up with a different approach, where synchronization is managed by userspace and the kernel just provides some primitives (kinda like futexes, but for gpu), with residency and gpu pte wrangling managed entirely separately. But:

- 10 years ago drivers/gpu was a handful of people at best

- we had just finished the massive rewrite to get to a kernel memory manager and kernel modesetting (over 5 years after windows/macos), so appetite for another massive rewrite was minimal

Here we are, now with 50 more drivers built on top and an entire userspace ecosystem that relies on all this (because yes, we made dma_fence the building block for all the cross-process uapi too, why wouldn't we).

I hope that explains a bit the history of how and why we ended up here. Maybe I should do a plumbers talk about "How not to memory manage - cautionary tales from drivers/gpu". I think there are a lot of areas where the conversation usually goes "wtf", followed by a long explanation of the history and technical reasons leading to an "oh dear". With a lot of other accelerators and things landing it might be good to have a list of things that look tempting (because hey, 2% faster) but aren't worth the pain.

-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
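[Editorial aside: the cross-driver completion fence discussed in this thread boils down to a one-shot "signal once, wait many" object that lets one engine's queued work start only after another engine finishes, without a round-trip through userspace. The sketch below is a purely illustrative userspace analogy in Python; the names `Fence`, `gpu_render`, and `display_flip` are hypothetical and this is not the kernel dma_fence API.]

```python
import threading

class Fence:
    """One-shot completion object: starts unsignaled, is signaled exactly
    once, and any number of waiters may block on it (dma_fence analogy)."""
    def __init__(self):
        self._event = threading.Event()

    def signal(self):
        # Producer side: mark the work complete, wake all waiters.
        self._event.set()

    def wait(self, timeout=None):
        # Consumer side: block until signaled; returns False on timeout.
        return self._event.wait(timeout)

def gpu_render(scene, fence):
    # "Driver A": produce a frame, then signal completion.
    frame = "rendered:" + scene
    fence.signal()
    return frame

def display_flip(fence, results):
    # "Driver B": its queued work waits on A's fence before scanning out,
    # so the pipeline never bounces back to userspace in between.
    assert fence.wait(timeout=5), "render fence never signaled"
    results.append("flipped")

fence = Fence()
results = []
# Queue the consumer first: it blocks on the fence, not on userspace.
consumer = threading.Thread(target=display_flip, args=(fence, results))
consumer.start()
gpu_render("triangle", fence)
consumer.join()
print(results)  # ['flipped']
```

The illustrative point is the ordering contract only: once such a fence leaks across driver boundaries, every signaler inherits obligations to every waiter, which is exactly the cross-driver coupling the thread is debating.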