Subject: Re: [Intel-gfx] [PATCH v2 06/21] drm/i915/gt: Batch TLB invalidations
Date: Thu, 28 Jul 2022 11:11:27 +0100
From: Tvrtko Ursulin
Organization: Intel Corporation UK Plc
To: Mauro Carvalho Chehab
Cc: stable@vger.kernel.org, Thomas Hellström, linux-media@vger.kernel.org,
    David Airlie, intel-gfx@lists.freedesktop.org, Lucas De Marchi,
    linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org,
    Christian König, linaro-mm-sig@lists.linaro.org, Chris Wilson,
    Rodrigo Vivi, Dave Airlie, Tomas Winkler, Mauro Carvalho Chehab,
    Sumit Semwal, Matthew Auld
In-Reply-To: <20220728083232.352f80cf@maurocar-mobl2>

On 28/07/2022 07:32, Mauro Carvalho Chehab wrote:
> On Wed, 27 Jul 2022 13:56:50 +0100
> Tvrtko Ursulin wrote:
>
>>> Because vma_invalidate_tlb() basically stores a TLB seqno, but the
>>> actual invalidation is deferred to when the pages are unset, at
>>> __i915_gem_object_unset_pages().
>>>
>>> So, what happens is:
>>>
>>> - in VMA sync mode, the need to invalidate the TLB is marked at
>>>   __vma_put_pages(), before the VMA unbind;
>>> - in async mode, this is deferred to happen at ppgtt_unbind_vma(),
>>>   which marks the need to invalidate TLBs.
>>>
>>> In both cases, __i915_gem_object_unset_pages() is called later,
>>> when the driver is ready to unmap the pages.
>>
>> Sorry, it is still not clear to me why the patch moves the marking of
>> the need to invalidate (regardless of whether it is a bit like today,
>> or a seqno like in this patch) from bind to unbind?
>>
>> What if the seqno was stored in i915_vma_bind, where the bit is set
>> today? Then all the hunks which touch the unbind and evict paths would
>> disappear from the patch. What wouldn't work in that case, if anything?
>
> Ah, now I see your point.
>
> I can't see any sense in having a sequence number at VMA bind, as the
> unbind order can be different. Whether a full TLB invalidation is
> needed or not depends on the unbind order.

Sorry, yes, that was stupid of me. What I was really thinking of was the
approach I initially used for coalescing: keep the set_bit in bind and
then, once the code enters intel_gt_invalidate_tlbs, take a "ticket" and
wait on the mutex. Once a waiter gets the mutex, it checks its ticket
against the GT copy; if two invalidations have passed since it started
waiting on the mutex, it can exit immediately. Roughly along the lines of
the sketch below.
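(A purely hypothetical sketch of that ticket scheme, not what this patch
implements; the tlb_invalidations counter and the tlb_invalidate_lock
field are invented here for illustration:)

    static void intel_gt_invalidate_tlbs(struct intel_gt *gt)
    {
            /* Take a "ticket": snapshot the GT-wide flush count. */
            u32 ticket = READ_ONCE(gt->tlb_invalidations);

            mutex_lock(&gt->tlb_invalidate_lock);

            /*
             * Two completed flushes guarantee that at least one of them
             * started after our ticket was taken (the first may already
             * have been in flight at that point), so our entries are
             * already gone and we can bail out.
             */
            if (READ_ONCE(gt->tlb_invalidations) - ticket >= 2)
                    goto unlock;

            mmio_invalidate_full(gt);
            WRITE_ONCE(gt->tlb_invalidations,
                       gt->tlb_invalidations + 1);
    unlock:
            mutex_unlock(&gt->tlb_invalidate_lock);
    }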
That would seem like a minimal improvement to batch things up. But I
guess it would still emit needless invalidations if there is no
contention, just a stream of serialized put-pages calls, while the
approach from this patch can skip all but the truly required ones.

Okay, go for it, and thanks for the explanations.

Acked-by: Tvrtko Ursulin

Regards,

Tvrtko

P.S. The last remaining "ugliness" is the second call to invalidation
from evict. It would be nicer if there were a single common place to do
it on vma unbind, but okay, I do not plan to dig into it, so fine.

> The way the current algorithm works is that drm_i915_gem_objects can be
> created in any order and, at unbind/evict time, each receives a seqno.
>
> The seqno is incremented at intel_gt_invalidate_tlb():
>
>    void intel_gt_invalidate_tlb(struct intel_gt *gt, u32 seqno)
>    {
>            with_intel_gt_pm_if_awake(gt, wakeref) {
>                    mutex_lock(&gt->tlb.invalidate_lock);
>                    if (tlb_seqno_passed(gt, seqno))
>                            goto unlock;
>
>                    mmio_invalidate_full(gt);
>
>                    write_seqcount_invalidate(&gt->tlb.seqno); // increment seqno
>
> So, let's say three objects were created, in this order:
>
>    obj1
>    obj2
>    obj3
>
> They may be unbound/evicted in a different order. At that time, each
> object's mm.tlb is stamped with a seqno, using the number from the last
> TLB flush, plus 1.
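> The tlb_seqno_passed() check used above then boils down to a wrap-safe
> comparison of that stamp against the current GT counter, along these
> lines (a sketch of the logic only; the exact code in the patch may
> differ, and intel_gt_tlb_seqno() is assumed here to read the current
> value of gt->tlb.seqno):
>
>    static bool tlb_seqno_passed(const struct intel_gt *gt, u32 seqno)
>    {
>            u32 cur = intel_gt_tlb_seqno(gt); /* even; bumped by 2 per flush */
>
>            /*
>             * Stamps are odd (flush seqno | 1), so this is true only
>             * if a full invalidation completed after the object was
>             * stamped at unbind/evict time.
>             */
>            return (s32)(cur - seqno) > 0;
>    }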
> As different threads can be used to handle TLB flushes, let's imagine
> two threads (just for the sake of having an example). In such a case,
> what we would have is:
>
>    seqno    Thread 0                         Thread 1
>
>    seqno=2  unbind/evict event
>             obj3.mm.tlb = seqno | 1
>
>    seqno=2                                   unbind/evict event
>                                              obj1.mm.tlb = seqno | 1
>
>             __i915_gem_object_unset_pages()
>             called for obj3: TLB flush
>             happened, invalidating both
>             obj1 and obj2.
>             seqno += 2
>
>    seqno=4                                   unbind/evict event
>                                              obj2.mm.tlb = seqno | 1
>
>             __i915_gem_object_unset_pages()
>             called for obj1: don't flush.
>
>    ...
>
>                                              __i915_gem_object_unset_pages()
>                                              called for obj2: TLB flush
>                                              happened.
>                                              seqno += 2
>    seqno=6
>
> So, basically, the seqno is used to track when an object's data stopped
> being updated because of an unbind/evict event. It is later used by
> intel_gt_invalidate_tlb(), when called from
> __i915_gem_object_unset_pages(), in order to check whether a previous
> invalidation call was already enough to invalidate the object, or
> whether a new call is needed.
>
> Now, if the seqno were stored at bind time, data could still leak, as
> the assumption made by intel_gt_invalidate_tlb(), that the data stopped
> being used at that seqno, would no longer hold.
>
> Still, I agree that this logic is complex and should be better
> documented. So, if you're now OK with this patch, I'll add the above
> explanation inside a kernel-doc comment.
>
> Regards,
> Mauro