Date: Thu, 14 Sep 2023 10:20:10 +0200
From: Boris Brezillon
To: Thomas Hellström
Cc: Dave Airlie, Danilo Krummrich, daniel@ffwll.ch, matthew.brost@intel.com,
 sarah.walker@imgtec.com, donald.robson@imgtec.com, christian.koenig@amd.com,
 faith.ekstrand@collabora.com, dri-devel@lists.freedesktop.org,
 nouveau@lists.freedesktop.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation
Message-ID: <20230914102010.628ea823@collabora.com>
In-Reply-To: <6f4c047d-0e6d-c45b-3092-cd0bc84849dc@linux.intel.com>
References: <20230909153125.30032-1-dakr@redhat.com>
 <20230909153125.30032-7-dakr@redhat.com>
 <20230913090311.5eeb026a@collabora.com>
 <20230913091918.62c06a30@collabora.com>
 <20230913133318.15edec7c@collabora.com>
 <6f4c047d-0e6d-c45b-3092-cd0bc84849dc@linux.intel.com>
Organization: Collabora
On Wed, 13 Sep 2023 15:22:56 +0200
Thomas Hellström wrote:

> On 9/13/23 13:33, Boris Brezillon wrote:
> > On Wed, 13 Sep 2023 12:39:01 +0200
> > Thomas Hellström wrote:
> >
> >> Hi,
> >>
> >> On 9/13/23 09:19, Boris Brezillon wrote:
> >>> On Wed, 13 Sep 2023 17:05:42 +1000
> >>> Dave Airlie wrote:
> >>>
> >>>> On Wed, 13 Sept 2023 at 17:03, Boris Brezillon wrote:
> >>>>> On Tue, 12 Sep 2023 18:20:32 +0200
> >>>>> Thomas Hellström wrote:
> >>>>>
> >>>>>>> +/**
> >>>>>>> + * get_next_vm_bo_from_list() - get the next vm_bo element
> >>>>>>> + * @__gpuvm: The GPU VM
> >>>>>>> + * @__list_name: The name of the list we're iterating on
> >>>>>>> + * @__local_list: A pointer to the local list used to store already iterated items
> >>>>>>> + * @__prev_vm_bo: The previous element we got from drm_gpuvm_get_next_cached_vm_bo()
> >>>>>>> + *
> >>>>>>> + * This helper is here to provide lockless list iteration. Lockless as in, the
> >>>>>>> + * iterator releases the lock immediately after picking the first element from
> >>>>>>> + * the list, so list insertion and deletion can happen concurrently.
> >>>>>> Are the list spinlocks needed for that async state update from within
> >>>>>> the dma-fence critical section we've discussed previously?
> >>>>> Any driver calling _[un]link() from its drm_gpu_scheduler::run_job()
> >>>>> hook will be in this situation (Panthor at the moment, PowerVR soon). I
> >>>>> get that Xe and Nouveau don't need that because they update the VM
> >>>>> state early (in the ioctl path), but I keep thinking this will hurt us
> >>>>> if we don't think it through from the beginning, because once you've
> >>>>> set this logic to depend only on resv locks, it will be pretty hard to
> >>>>> get back to a solution which lets synchronous VM_BINDs take precedence
> >>>>> over asynchronous requests, and, with vkQueueBindSparse() passing external
> >>>>> deps (plus the fact the VM_BIND queue might be pretty deep), it can
> >>>>> take a long time to get your synchronous VM_BIND executed...
> >> So this would boil down to either (possibly opt-in) keeping the spinlock
> >> approach or pushing the unlink out to a wq then?
> > Deferred _unlink() would not be an issue, since I already defer the
> > drm_gpuva destruction to a wq; it would just be a matter of moving the
> > _unlink() call there as well. But _link() also takes the GEM gpuva list
> > lock, and that one is a bit tricky, in that sm_map() can trigger 2 more
> > _link() calls for the prev/next mappings, which we can't guess until we
> > get to execute the VM update. If we mandate the use of the GEM resv
> > lock, that simply means async VM updates (AKA calling
> > drm_gpuvm_sm_[un]map()) are not an option. And if this is what everyone
> > agrees on, then I'd like the APIs that make this sort of async VM
> > update possible (drm_gpuvm_sm_[un]map(), the drm_gpuvm_ops::sm_step*
> > methods, and probably other things) to be dropped, so we don't make it
> > look like it's something we support.
> >
> >> BTW, as also asked in a reply to Danilo, how do you call unlink from
> >> run_job() when it was requiring the obj->dma_resv lock, or was that a WIP?
> > _unlink() makes sure the GEM gpuva list lock is taken, but this can be
> > a custom lock (see drm_gem_gpuva_set_lock()). In Panthor we have
> > panthor_gem_object::gpuva_list_lock that's dedicated to the gpuva list
> > protection. We make sure we never take this lock while allocating
> > memory, to guarantee the dma-signalling path can't deadlock.
> >
> >>>>
> >>>> btw what is the use case for this? do we have actual vulkan
> >>>> applications we know will have problems here?
> >>> I don't, but I think that's a concern Faith raised at some point (dates
> >>> back from when I was reading threads describing how VM_BIND on i915
> >>> should work, and I was clearly discovering this whole VM_BIND thing at
> >>> that time, so maybe I misunderstood).
> >>>
> >>>> it feels like a bit of premature optimisation, but maybe we have use cases.
> >>> Might be, but that's the sort of thing that would put us in a corner if
> >>> we don't have a plan for when the need arises. Besides, if we don't
> >>> want to support that case because it's too complicated, I'd recommend
> >>> dropping all the drm_gpuvm APIs that let people think this mode is
> >>> valid/supported (map/remap/unmap hooks in drm_gpuvm_ops,
> >>> drm_gpuvm_sm_[un]map helpers, etc). Keeping them around just adds to the
> >>> confusion.
> >> Xe allows bypassing the bind-queue with another bind-queue, but to
> >> completely avoid dependencies between queues the operations may not
> >> overlap.
> > So, you check the VM state with some VM lock held (would be the VM resv
> > in my case), and if the mapping is new (no overlaps with pre-existing
> > mappings), you queue it to the fast-track/sync-VM_BIND queue. What would
> > be missing, I guess, is a way to know if the mapping is active (MMU has
> > been updated) or pending (MMU update queued to the bind-queue), so I can
> > fast-track mapping/unmapping of active mappings.

Ok, so I started modifying the implementation, and quickly realized the
overlap test can't be done without something like your xe_range_fence
tree, because of unmaps. Since we call drm_gpuva_unmap() early, in the
IOCTL path (IOW, before the mapping teardown is effective), we lose
track of this yet-to-be-executed unmap operation: if we do our
va_range_overlaps_with_existing_mappings() test after such an unmap has
been queued, using just the drm_gpuvm tree, we might get false even
though the mapping still exists and is expected to be torn down when the
VM_BIND(unmap) job is executed on the bind-queue. As a result, the
VM_BIND(map,sync) might be executed immediately (because the dependency
went undetected), and then the vm_bind_run_job() function kicks in and
undoes what the synchronous VM_BIND(map) did. Am I missing something?

If I'm correct, that means I'm back to having synchronous VM_BIND ops
queued after all asynchronous ones, unless I use something like your
xe_range_fence solution (which I was hoping I could postpone until we
decide to expose multiple bind queues).

I'm still a bit skeptical about this 'update the VM mappings tree early,
defer the MMU page table updates' approach, where the VM state and the
actual page table tree are temporarily out of sync until all operations
have been flushed on all queues targeting a VM. It means any test we do
on the gpuvm, like 'give me the BO mapped at VA xxx', is subject to the
question 'is this the current state or the future state?' (see the
sketch below).
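To make the failure mode concrete, here is the sequence I'm worried
about, as a rough commented sketch. va_range_overlaps_with_existing_mappings()
is the hypothetical helper named above, and vm_bind_map_sync() is a
made-up stand-in for the synchronous fast path; neither is an actual
drm_gpuvm API:

/*
 * T0: userspace queues VM_BIND(unmap [0x1000, 0x2000)) on the bind-queue.
 *     The IOCTL path calls drm_gpuva_unmap() right away, so the va is
 *     gone from the drm_gpuvm tree, but the MMU mapping stays live until
 *     the bind-queue job actually runs.
 *
 * T1: userspace then submits a synchronous VM_BIND(map [0x1000, 0x2000)).
 */
if (!va_range_overlaps_with_existing_mappings(gpuvm, 0x1000, 0x1000)) {
	/*
	 * False negative: the tree only reflects the future state, so
	 * the pending unmap is invisible here and we take the fast path.
	 */
	vm_bind_map_sync(gpuvm, bo, bo_offset, 0x1000, 0x1000);
}

/*
 * T2: the bind-queue's vm_bind_run_job() finally executes the queued
 *     unmap and tears down the PTEs the synchronous map installed at T1.
 */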
Note that we can't even get the current VM state anymore, because all
the drm_gpuvm::tree stores with this solution is the future state, and
to-be-unmapped mappings are lost during the transitioning period (when
vm_bind jobs are queued but not executed yet).

> > This would leave
> > overlapping sync/async VM updates, which can't happen in practice
> > unless userspace is doing something wrong (sparse bindings always go
> > through vkQueueBindSparse).
>
> User-space is allowed to create new bind queues at will, and they
> execute independently save for range overlaps.
>
> And the overlapping granularity depends very much on the detail of the
> range tracking.
> We drafted this fenced range utility
>
> https://gitlab.freedesktop.org/drm/xe/kernel/-/merge_requests/353

I'll try to see if there's a way we can have something generic shared at
the gpuvm level.
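For the record, here's a very rough strawman of what a generic
range-fence utility at the gpuvm level could look like, loosely modeled
on the MR linked above. The drm_gpuvm_range_* names are made up for
illustration; only the interval-tree and dma-fence helpers are existing
kernel APIs:

#include <linux/container_of.h>
#include <linux/dma-fence.h>
#include <linux/interval_tree.h>

/* One node per queued bind-queue operation, keyed on the VA range it
 * touches. The fence signals once the MMU update actually landed. */
struct drm_gpuvm_range_fence {
	struct interval_tree_node itree;	/* [start, last] in VA space */
	struct dma_fence *fence;
};

/*
 * Is [addr, addr + range) still covered by a not-yet-executed
 * bind-queue operation? If so, the synchronous fast path can't trust
 * the drm_gpuvm tree and has to wait on (or inherit a dependency from)
 * the pending operation instead.
 */
static bool drm_gpuvm_range_busy(struct rb_root_cached *root,
				 u64 addr, u64 range)
{
	struct interval_tree_node *node;
	u64 last = addr + range - 1;

	for (node = interval_tree_iter_first(root, addr, last); node;
	     node = interval_tree_iter_next(node, addr, last)) {
		struct drm_gpuvm_range_fence *rfence =
			container_of(node, struct drm_gpuvm_range_fence,
				     itree);

		if (!dma_fence_is_signaled(rfence->fence))
			return true;
	}

	return false;
}

The writer side would insert a node when a VM_BIND job is queued, and
signal the fence and remove the node from the job execution path, which
is essentially what the xe_range_fence utility does, just not tied to xe.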