From: Dmitry Osipenko
Date: Tue, 12 Apr 2022 21:20:29 +0300
Subject: Re: [PATCH v1] drm/scheduler: Don't kill jobs in interrupt context
To: Andrey Grodzovsky, David Airlie, Daniel Vetter, Tomeu Vizoso, Steven Price, Rob Herring, Alyssa Rosenzweig, Rob Clark
Cc: dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org, Dmitry Osipenko
References: <20220411221536.283312-1-dmitry.osipenko@collabora.com> <064d8958-a288-64e1-b2a4-c2302a456d5b@amd.com>
In-Reply-To: <064d8958-a288-64e1-b2a4-c2302a456d5b@amd.com>

On 4/12/22 19:51, Andrey Grodzovsky wrote:
>
> On 2022-04-11 18:15, Dmitry Osipenko wrote:
>> Interrupt context can't sleep. Drivers like Panfrost and MSM are taking
>> mutex when job is released, and thus, that code can sleep. This results
>> in "BUG: scheduling while atomic" if locks are contended while job is
>> freed. There is no good reason for releasing scheduler's jobs in IRQ
>> context, hence use normal context to fix the trouble.
>
>
> I am not sure this is the best idea to leave job's sw fence signalling
> to be executed in system_wq context which is prone to delays of
> executing various work items from around the system. Seems better to me
> to leave the fence signaling within the IRQ context and offload only the
> job freeing or, maybe handle rescheduling to thread context within
> drivers implementation of .free_job cb. Not really sure which is the
> better.

We're talking here about killing jobs when the driver destroys a context,
which doesn't feel like it needs to be a fast path. I could move the
signalling into drm_sched_entity_kill_jobs_cb() and use an unbound wq, but
do we really need this for a slow path?
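
For readers following the thread, the deferral pattern under discussion can be sketched roughly as below. This is a userspace simulation only, not the actual patch and not kernel code: every name in it (demo_job, demo_kill_job_cb, demo_free_job, demo_run_pending_work, demo_kill_jobs, in_atomic_ctx) is made up for illustration, and a plain linked list stands in for a workqueue. The callback that runs in (simulated) IRQ context only queues the job, and the release that may sleep happens later from normal context.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these names exist in the DRM scheduler. */
#define DEMO_MAX_JOBS 8

struct demo_job {
    int id;
    struct demo_job *next;   /* link in the pending-work list */
};

static bool in_atomic_ctx;   /* simulates "we are in IRQ context" */
static struct demo_job demo_jobs[DEMO_MAX_JOBS];
static struct demo_job *pending_head, *pending_tail;

/*
 * Stand-in for a driver's job release. In a real driver this could take
 * a mutex and sleep, so it must never be called from atomic context.
 */
static void demo_free_job(struct demo_job *job)
{
    assert(!in_atomic_ctx);
    (void)job;
}

/* Callback run in simulated IRQ context: it only queues the job. */
static void demo_kill_job_cb(struct demo_job *job)
{
    assert(in_atomic_ctx);
    job->next = NULL;
    if (pending_tail)
        pending_tail->next = job;
    else
        pending_head = job;
    pending_tail = job;
}

/* Stand-in for the worker: drains the queue from normal context. */
static int demo_run_pending_work(void)
{
    int freed = 0;

    while (pending_head) {
        struct demo_job *job = pending_head;

        pending_head = job->next;
        if (!pending_head)
            pending_tail = NULL;
        demo_free_job(job);
        freed++;
    }
    return freed;
}

/* Queue `count` jobs from "IRQ context", then free them afterwards. */
int demo_kill_jobs(int count)
{
    if (count < 0 || count > DEMO_MAX_JOBS)
        return -1;

    in_atomic_ctx = true;
    for (int i = 0; i < count; i++) {
        demo_jobs[i].id = i;
        demo_kill_job_cb(&demo_jobs[i]);
    }
    in_atomic_ctx = false;

    return demo_run_pending_work();
}
```

Calling demo_kill_jobs(3) from a small main() returns 3, the number of jobs released outside the simulated atomic section. The latency concern raised above corresponds to the gap between the queueing loop and demo_run_pending_work(), which in a real system depends on how busy the chosen workqueue is.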