From: Asahi Lina <lina@asahilina.net>
Date: Thu, 9 Mar 2023 15:30:17 +0900
Subject: Re: [PATCH RFC 10/18] drm/scheduler: Add can_run_job callback
To: Christian König, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
    David Airlie, Daniel Vetter, Miguel Ojeda, Alex Gaynor,
    Wedson Almeida Filho, Boqun Feng, Gary Guo, Björn Roy Baron,
    Sumit Semwal, Luben Tuikov, Jarkko Sakkinen, Dave Hansen
Cc: Alyssa Rosenzweig, Karol Herbst, Ella Stanforth, Faith Ekstrand, Mary,
    linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org,
    rust-for-linux@vger.kernel.org, linux-media@vger.kernel.org,
    linaro-mm-sig@lists.linaro.org, linux-sgx@vger.kernel.org,
    asahi@lists.linux.dev

On 09/03/2023 05.14, Christian König wrote:
>> I think you mean wake_up_interruptible(). That would be
>> drm_sched_job_done(), on the fence callback when a job completes, which
>> as I keep saying is the same logic used for
>> hw_rq_count/hw_submission_limit tracking.
>
> As the documentation to wait_event says:
>
>  * wake_up() has to be called after changing any variable that could
>  * change the result of the wait condition.
>
> So what you essentially try to do here is to skip that and say
> drm_sched_job_done() would call that anyway, but when you read any
> variable to determine that state then as far as I can see nothing is
> guaranteeing that order.

The driver needs to guarantee that any changes to that state precede
the job completion fence signal, of course; that is the entire idea of
the API. It is supposed to represent a check for per-scheduler (or more
specific, but not more global) resources that are released on job
completion. Of course, if you misuse the API you can cause problems,
but what I'm trying to say is that the API, as designed and when used
as intended, does work properly.

Put another way: job completions always need to cause the sched main
loop to run an iteration anyway (otherwise we wouldn't make forward
progress), and job completions are exactly the signal that the
can_run_job() condition may have changed.
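To make that contract concrete, a driver using the callback would look
roughly like this. This is only a sketch: the my_* names are made up
for illustration and are not the actual driver code.

#include <linux/atomic.h>
#include <linux/dma-fence.h>
#include <drm/gpu_scheduler.h>

struct my_sched_ctx {
	struct drm_gpu_scheduler sched;
	atomic_t free_slots;	/* the per-scheduler resource being tracked */
};

static bool my_can_run_job(struct drm_sched_job *job)
{
	struct my_sched_ctx *ctx =
		container_of(job->sched, struct my_sched_ctx, sched);

	/* A pure, non-blocking check of per-scheduler state. */
	return atomic_read(&ctx->free_slots) > 0;
}

/*
 * Job completion path (e.g. called from the IRQ handler). The contract
 * is strictly: 1. free the resources, 2. signal the fence. The fence
 * signal is what ends up running drm_sched_job_done() and waking the
 * scheduler, so by the time the main loop re-checks, the state that
 * my_can_run_job() reads has already been updated.
 */
static void my_job_complete(struct my_sched_ctx *ctx,
			    struct dma_fence *done_fence)
{
	atomic_inc(&ctx->free_slots);	/* 1. free resources */
	dma_fence_signal(done_fence);	/* 2. signal fence */
}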
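And for comparison, this is the existing hw_rq_count/hw_submission_limit
pattern it mirrors, condensed from drm_sched_job_done() and
drm_sched_main() (heavily trimmed here for illustration; the real
functions do more than this):

/* Fence callback: runs when a job's hardware fence signals. */
static void drm_sched_job_done(struct drm_sched_job *s_job)
{
	struct drm_gpu_scheduler *sched = s_job->sched;

	/* Release the per-scheduler resource (a hw queue slot)... */
	atomic_dec(&sched->hw_rq_count);
	/* ...then wake the main loop so it re-evaluates its condition. */
	wake_up_interruptible(&sched->wake_up_worker);
}

/* From inside drm_sched_main()'s loop (condensed). The wq is armed
 * first, then the condition is evaluated, and the thread only sleeps
 * if the condition is false; a wake-up at any point after arming
 * forces a re-check. That is exactly what makes the interleaving
 * argument below work, and can_run_job() is just one more term of
 * this same condition. */
struct drm_sched_job *cleanup_job = NULL;
struct drm_sched_entity *entity = NULL;

wait_event_interruptible(sched->wake_up_worker,
			 (cleanup_job = drm_sched_get_cleanup_job(sched)) ||
			 (!drm_sched_blocked(sched) &&
			  (entity = drm_sched_select_entity(sched))) ||
			 kthread_should_stop());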
> The only other possibility how you could use the callback correctly
> would be to call drm_fence_is_signaled() to query the state of your hw
> submission from the same fence which is then signaled. But then the
> question is once more why you don't give that fence directly to the
> scheduler?

But the driver is supposed to guarantee that the ordering is always
1. resources freed, 2. fence signaled. So you don't need to check the
fence; you can just check the resource state. If the callback returns
false, then by definition the fence was not yet signaled at some point
during its execution (because the resources were not yet freed), and
since the callback runs in the wait_event_interruptible() check path,
by definition a fence signal at any point during or after the check
will wake the thread up to re-check:

Thread 1                                     Thread 2
1. wait_event_interruptible() arms wq        1. Free resources
2. can_run_job() checks resources            2. Signal fence
3. wait_event_interruptible() sleeps on wq   3. Fence wakes up wq
4. loop

There is no possible interleaving of those sequences that leads to a
lost event and the thread not waking up:

- If T2.3 happens before T1.1, then T2.1 happened even earlier, so T1.2
  must return true.
- If T2.3 happens after T1.1 but before T1.3, the wq code ensures the
  thread does not sleep at T1.3 (or immediately wakes up), since the wq
  was signaled after it was armed, during the condition check. On the
  next loop iteration, T1.2 then returns true, because T2.1 already
  happened before T2.3.
- If T2.3 happens during T1.3, the wq wakes up normally and re-checks,
  at which point T1.2 returns true.

QED.

~~ Lina