Received: by 2002:a25:c205:0:0:0:0:0 with SMTP id s5csp5835220ybf; Thu, 5 Mar 2020 07:56:06 -0800 (PST) X-Google-Smtp-Source: ADFU+vtg8IKjzobK52wwWN5RS/Zfpas1LEnSw3L1gEhhc+5U2CRH5cVY8Q+rB4qWils96zXwPrO9 X-Received: by 2002:aca:5f87:: with SMTP id t129mr6178681oib.36.1583423766410; Thu, 05 Mar 2020 07:56:06 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1583423766; cv=none; d=google.com; s=arc-20160816; b=nazVoRoxl9R5Po4oHHMLh2giU2xNzmtARauDsQnZKCzzLc9G0arOAZ1j7y5RZkoRNj umhPcAgxfztBXEsfxMy7pLY/LUkd7t1DOnDFjYz1vAfcLC1cRK/k5nzriLFAIY0LxA4z 91pj6E7uQmweZPWmMP5gmSUA1aDH5uFyKYzDSi+XAmuTZPNVnkj0jdeg2Ztqqx/wqodg PWZ6iK9njupIqyDHxROC1U2NQ2hFUxvLfuu8w1IzyUztuhyTWoAV+qsORMgGr/Obf5Iv FuFX0UNsTUUCxb+om3MZo0ur8yvOd5pzXZGN4tVrn2ZeM8l0LlWFph+gESWiEh1s69qa wj5w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:cc:to:subject :message-id:date:from:in-reply-to:references:mime-version :dkim-signature; bh=jpgwSkp7XcrOD1X4HMaPrupSeCafE36RfxzoW5P2WRA=; b=dxXQFbIAYALl0S+ORrzW+hV7Cs3DV29pQZTqtZxJUljcZEmE/RD1b8amucQm8MTGj2 fuQ/xCakfI5UM55aGm+SV9MKTHQfPmY2FwO/aLXbeS6ChKiSspSUEyb+3RkcAwIrjN7g F5ebkwdtZ8iEkdaojoB89hO7B8QuY+x4KZtFMOb7/av5lpg5kx5bGRNVSBTnjaYrU4Fy gMdWTB4L2hZDxaH5hthCXdqiVFmXOYPcGMzZX+woi1UibVl8ERRWE0N0aIq554Gqk2v0 dhA5AiUkq6SPT7uEhiuku7R0aUw4x2J4iWiof7yZVWxS6FhuIVPZ79fM5iqYczt74orq 4mdg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@jlekstrand-net.20150623.gappssmtp.com header.s=20150623 header.b=DiAuYIkH; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id x7si3698990otp.114.2020.03.05.07.55.53; Thu, 05 Mar 2020 07:56:06 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@jlekstrand-net.20150623.gappssmtp.com header.s=20150623 header.b=DiAuYIkH; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727173AbgCEPyb (ORCPT + 99 others); Thu, 5 Mar 2020 10:54:31 -0500 Received: from mail-ed1-f68.google.com ([209.85.208.68]:38874 "EHLO mail-ed1-f68.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727141AbgCEPyb (ORCPT ); Thu, 5 Mar 2020 10:54:31 -0500 Received: by mail-ed1-f68.google.com with SMTP id e25so7376323edq.5 for ; Thu, 05 Mar 2020 07:54:28 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=jlekstrand-net.20150623.gappssmtp.com; s=20150623; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc:content-transfer-encoding; bh=jpgwSkp7XcrOD1X4HMaPrupSeCafE36RfxzoW5P2WRA=; b=DiAuYIkHTZXOVb9HWj0I5RfNbwb2gRoDsioZ3DGYGgGwpuHWBI6imJQcLeV/8vMoqx vLpKZYWVuc/WzCc5hrpyImzqk89IyeFPjEb+xaiDUdzUc3L+4vP0GxfMfWHAa5225inH bhqKp8vTYWxjL8QTRbdt0RawrERVG6l9iq4VKC12IMs05TN+MQMi7cEdi0ZODqu3zAmn l87bl6TItmmR7WIHUwFPQog5XZ0yhYkDSymk8Nmfar54q6ikBv7NtkZMuLgKjzmfUNuR ciEIB+WF2eGUFEjO13Md4l7m4l+Ku0SUEk5+AiVw2jj5FK33lWzc593ZRpUii2yNSxRP 8p1g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc:content-transfer-encoding; bh=jpgwSkp7XcrOD1X4HMaPrupSeCafE36RfxzoW5P2WRA=; b=EYAcgyxEYwZsRGZUAkjnIUXQlsx1uxD16SrrrkDrBHHgjwZ/GkviAi5VaqIKLATKsT k3rKfQM1+ztBdPVxtaV+MCn4dXz/ak3B4o5WzXnzg9GvU02xuiGCzcVOl/r0DwqOjED1 kLAyJZmm2WkO59brVG6DHQallk8qdMeMBawbkLfEWiARJkBWegNrie3hNzqqSYEQIVFT dshaZkPKPCpZtamZUP8SCV/osARQsz+wT6LKTdKHLnLGqG3dbYmI9dYyR3/7twq2Xlpr HfjBQA4heh8MQY/k21jw2928MAEQ51JWR/G5tRSxmJKhj1oUt75ZkNibOoA9tcAr8uLT ruGQ== X-Gm-Message-State: ANhLgQ2MrZumCwL1FlFq2zcb7cVdJGRNZL1hsliE1laqoNU16H/+h1KY jBqsKLfTPhYHkHb0YgpMwbajCUM330+RLlw0oIloZg== X-Received: by 2002:a05:6402:1655:: with SMTP id s21mr9282877edx.324.1583423668000; Thu, 05 Mar 2020 07:54:28 -0800 (PST) MIME-Version: 1.0 References: <20200225235856.975366-1-jason@jlekstrand.net> <8066d8b2-dd6a-10ef-a7bb-2c18a0661912@amd.com> <20200226100523.GQ2363188@phenom.ffwll.local> <810a26e7-4294-a615-b7ee-18148ac70641@amd.com> <21aeacc0-f3ae-c5dd-66df-4d2f3d73f73e@amd.com> <6fb8becf-9e6b-f59e-9c22-2b20069241a7@amd.com> In-Reply-To: <6fb8becf-9e6b-f59e-9c22-2b20069241a7@amd.com> From: Jason Ekstrand Date: Thu, 5 Mar 2020 09:54:16 -0600 Message-ID: Subject: Re: [PATCH] RFC: dma-buf: Add an API for importing and exporting sync files To: =?UTF-8?Q?Christian_K=C3=B6nig?= Cc: Bas Nieuwenhuizen , Dave Airlie , Jesse Hall , James Jones , Daniel Stone , =?UTF-8?Q?Kristian_H=C3=B8gsberg?= , Sumit Semwal , Chenbo Feng , Greg Hackmann , linux-media@vger.kernel.org, Maling list - DRI developers , linaro-mm-sig@lists.linaro.org, LKML , Daniel Vetter Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Mar 5, 2020 at 7:06 AM Christian K=C3=B6nig wrote: > > Am 04.03.20 um 17:41 schrieb Jason Ekstrand: > > On Wed, Mar 4, 2020 at 10:27 AM Jason Ekstrand w= rote: > >> On Wed, Mar 4, 2020 at 2:34 AM Christian K=C3=B6nig wrote: > >>> Am 03.03.20 um 20:10 schrieb Jason Ekstrand: > >>>> On Thu, Feb 27, 2020 at 2:28 AM Christian K=C3=B6nig > >>>> wrote: > >>>> [SNIP] > >>> For reference see what dance is necessary in the dma_fence_chain_rele= ase > >>> function to avoid that: > >>>> /* Manually unlink the chain as much as possible to avoid > >>>> recursion > >>>> * and potential stack overflow. > >>>> */ > >>>> while ((prev =3D rcu_dereference_protected(chain->prev, tru= e))) { > >>> .... > >>> > >>> It took me quite a while to figure out how to do this without causing > >>> issues. But I don't see how this would be possible for dma_fence_arra= y. > >> Ah, I see the issue now! It hadn't even occurred to me that userspace > >> could use this to build up an infinite recursion chain. That's nasty! > > Yeah, when I first stumbled over it it was like why the heck is my code > crashing in an interrupt handler? > > Realizing that this is stack corruption because of the long chain we > constructed was quite an enlightenment. > > And then it took me even longer to fix it :) Fun.... > >> I'll give this some more thought and see if can come up with > >> something clever. > >> > >> Here's one thought: We could make dma_fence_array automatically > >> collapse any arrays it references and instead directly reference their > >> fences. This way, no matter how much the client chains things, they > >> will never get more than one dma_fence_array. Of course, the > >> difficulty here (answering my own question) comes if they ping-pong > >> back-and-forth between something which constructs a dma_fence_array > >> and something which constructs a dma_fence_chain to get > >> array-of-chain-of-array-of-chain-of-... More thought needed. > > Condensing the fences into a larger array can certainly work, yes. > > > Answering my own questions again... I think the > > array-of-chain-of-array case is also solvable. > > > > For array-of-chain, we can simply add all unsignaled dma_fences in the > > chain to the array. The array won't signal until all of them have > > which is exactly the same behavior as if we'd added the chain itself. > > Yeah, that should work. Probably best to implement something like a > cursor to walk all fences in the data structure. > > > For chain-of-array, we can add all unsignaled dma_fences in the array > > to the same point in the chain. There may be some fiddling with the > > chain numbering required here but I think we can get it so the chain > > won't signal until everything in the array has signaled and we get the > > same behavior as if we'd added the dma_fence_array to the chain. > > Well as far as I can see this won't work because it would break the > semantics of the timeline sync. I'm not 100% convinced it has to. We already have support for the seqno regressing and we ensure that we still wait for all the fences. I thought maybe we could use that but I haven't spent enough time looking at the details to be sure. I may be missing something. > But I think I know a different way which should work: A dma_fence_chain > can still contain a dma_fence_array, only the other way around is > forbidden. Then we create the cursor functionality in such a way that it > allows us to deep dive into the data structure and return all containing > fences one by one. Agreed. As long as one container is able to consume the other, it's fine. > I can prototype that if you want, shouldn't be more than a few hours of > hacking anyway. If you'd like to, go for it. I'd be happy to give it a go as well but if you already know what you want, it may be easier for you to just write the patch for the cursor. Two more questions: 1. Do you want this collapsing to happen every time we create a dma_fence_array or should it be a special entrypoint? Collapsing all the time likely means doing extra array calculations instead of the dma_fence_array taking ownership of the array that's passed in. My gut says that cost is ok; but my gut doesn't spend much time in kernel space. 2. When we do the collapsing, should we call dma_fence_is_signaled() to avoid adding signaled fences to the array? It seems like avoiding adding references to fences that are already signaled would let the kernel clean them up faster and reduce the likelihood that a fence will hang around forever because it keeps getting added to arrays with other unsignaled fences. --Jason