References: <20220310234611.424743-1-robdclark@gmail.com> <20220310234611.424743-3-robdclark@gmail.com> <3945551d-47d2-1974-f637-1dbc61e14702@amd.com> <865abcff-9f52-dca4-df38-b11189c739ff@amd.com> <915537e2-ac5b-ab0e-3697-2b16a9ec8f91@amd.com> <3a475e5a-1090-e2f4-779c-6915fc8524b1@amd.com>
In-Reply-To: <3a475e5a-1090-e2f4-779c-6915fc8524b1@amd.com>
From: Rob Clark
Date: Thu, 17 Mar 2022 11:25:31 -0700
Subject: Re: [PATCH 2/3] drm/msm/gpu: Park scheduler threads for system suspend
To: Andrey Grodzovsky
Cc: Christian König, dri-devel, freedreno, linux-arm-msm, Rob Clark, Sean Paul, Abhinav Kumar, David Airlie, Akhil P Oommen, Jonathan Marek, AngeloGioacchino Del Regno, Bjorn Andersson, Vladimir Lypak, open list
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Mar 17, 2022 at 11:10 AM Andrey Grodzovsky wrote:
>
>
> On 2022-03-17 13:35, Rob Clark wrote:
> > On Thu, Mar 17, 2022 at 9:45 AM Christian König
> > wrote:
> >> On 17.03.22 at 17:18, Rob Clark wrote:
> >>> On Thu, Mar 17, 2022 at 9:04 AM Christian König
> >>> wrote:
> >>>> On 17.03.22 at 16:10, Rob Clark wrote:
> >>>>> [SNIP]
> >>>>> userspace frozen != kthread frozen .. that is what this patch is
> >>>>> trying to address, so we aren't racing between shutting down the hw
> >>>>> and the scheduler shoveling more jobs at us.
> >>>> Well exactly that's the problem. The scheduler is supposed to be shoveling
> >>>> more jobs at us until it is empty.
> >>>>
> >>>> Thinking more about it we will then keep some dma_fence instance
> >>>> unsignaled and that is an extremely bad idea since it can lead to
> >>>> deadlocks during suspend.
> >>> Hmm, perhaps that is true if you need to migrate things out of vram?
> >>> It is at least not a problem when vram is not involved.
> >> No, it's much wider than that.
> >>
> >> See what can happen is that the memory management shrinkers want to wait
> >> for a dma_fence during suspend.
> > we don't wait on fences in shrinker, only purging or evicting things
> > that are already ready. Actually, waiting on fences in shrinker path
> > sounds like a pretty bad idea.
> >
> >> And if you stop the scheduler they will just wait forever.
> >>
> >> What you need to do instead is to drain the scheduler, e.g. call
> >> drm_sched_entity_flush() with a proper timeout for each entity you have
> >> created.
> > yeah, it would work to drain the scheduler.. I guess that might be the
> > more portable approach as far as generic solution for suspend.
> >
> > BR,
> > -R
>
>
> I am not sure how this drains the scheduler? Suppose we have done the
> waiting in drm_sched_entity_flush,
> what prevents someone from pushing another job into the same
> entity's queue right after that?
> Shouldn't we first disable further pushing of jobs into the entity before we
> wait for sched->job_scheduled?
>

In the system suspend path, userspace processes will have already been
frozen, so there should be no way to push more jobs to the scheduler,
unless they are pushed from the kernel itself. We don't do that in
drm/msm, but maybe you need to move things between vram and system
memory? But even in that case, if the number of jobs you push is bounded
I guess that is ok?

BR,
-R