Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752168AbbKPM4B (ORCPT ); Mon, 16 Nov 2015 07:56:01 -0500 Received: from mail.fireflyinternet.com ([87.106.93.118]:53872 "EHLO fireflyinternet.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1751496AbbKPMz4 (ORCPT ); Mon, 16 Nov 2015 07:55:56 -0500 X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Date: Mon, 16 Nov 2015 12:55:37 +0000 From: Chris Wilson To: Tvrtko Ursulin Cc: Jens Axboe , intel-gfx@lists.freedesktop.org, linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org, Daniel Vetter , Eero Tamminen , "Rantala, Valtteri" , stable@kernel.vger.org Subject: Re: [PATCH 2/2] drm/i915: Limit the busy wait on requests to 2us not 10ms! Message-ID: <20151116125537.GS569@nuc-i3427.alporthouse.com> Mail-Followup-To: Chris Wilson , Tvrtko Ursulin , Jens Axboe , intel-gfx@lists.freedesktop.org, linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org, Daniel Vetter , Eero Tamminen , "Rantala, Valtteri" , stable@kernel.vger.org References: <1447594364-4206-1-git-send-email-chris@chris-wilson.co.uk> <1447594364-4206-2-git-send-email-chris@chris-wilson.co.uk> <5649AEED.9090807@linux.intel.com> <20151116111208.GQ569@nuc-i3427.alporthouse.com> <5649C728.5040109@linux.intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <5649C728.5040109@linux.intel.com> User-Agent: Mutt/1.5.21 (2010-09-15) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2900 Lines: 76 On Mon, Nov 16, 2015 at 12:08:08PM +0000, Tvrtko Ursulin wrote: > > On 16/11/15 11:12, Chris Wilson wrote: > >On Mon, Nov 16, 2015 at 10:24:45AM +0000, Tvrtko Ursulin wrote: > >>On 15/11/15 13:32, Chris Wilson wrote: > >>>+static u64 local_clock_us(unsigned *cpu) > >>>+{ > >>>+ u64 t; > >>>+ > >>>+ *cpu = get_cpu(); > >>>+ t = local_clock() >> 10; > >> > >>Needs comment I think to explicitly mention the approximation, or > >>maybe drop the _us suffix? > > > >I did consider _approx_us but thought that was overkill. A comment along > >the lines of > >/* Approximately convert ns to us - the error is less than the > > * truncation! > > */ > > And the result is not used in subsequent calculations apart from > comparing against an approximate timeout? Exactly, the timeout is fairly arbitrary and defined in the same units. That we truncate is a much bigger cause for concern in terms of spinning accurately for a definite length of time. > >>>@@ -1161,7 +1183,7 @@ static int __i915_spin_request(struct drm_i915_gem_request *req, int state) > >>> if (signal_pending_state(state, current)) > >>> break; > >>> > >>>- if (time_after_eq(jiffies, timeout)) > >>>+ if (busywait_stop(timeout, cpu)) > >>> break; > >>> > >>> cpu_relax_lowlatency(); > >>> > >> > >>Otherwise looks good. Not sure what would you convert to 32-bit from > >>your follow up reply since you need us resolution? > > > >s/u64/unsigned long/ s/time_after64/time_after/ > > > >32bits of us resolution gives us 1000s before wraparound between the two > >samples. And I hope that a 1000s doesn't pass between loops. Or if it does, > >the GPU managed to complete its task. > > Now I see that you did say low bits.. yes that sounds fine. > > Btw while you are optimizing things maybe pick up this micro > optimization: http://patchwork.freedesktop.org/patch/64339/ > > Not in scope of this thread but under the normal development patch flow. There's a different series which looks at tackling the scalabiltiy issue with dozens of concurrent waiters. I have an equivalent patch there and one to tidy up the seqno query. > Btw2, any benchmark result changes with this? Spinning still gives the dramatic (2x) improvement in the microbenchmarks (over pure interrupt driven waits), so that improvement is preserved. There are a couple of interesting swings in the macro tests (comparedt to the previous jiffie patch) just above the noise level which could well be a change in the throttling/scheduling. (And those tests are also the ones that correspond to the greatest gains (10-40%) using spinning.) -Chris -- Chris Wilson, Intel Open Source Technology Centre -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/