Date: Thu, 17 Mar 2022 18:29:29 +0100
From: Daniel Vetter
To: Christian König
Cc: Rob Clark, Rob Clark, Jonathan Marek, David Airlie, freedreno,
 Vladimir Lypak, Abhinav Kumar, dri-devel, Bjorn Andersson,
 Akhil P Oommen, linux-arm-msm, Sean Paul, open list,
 AngeloGioacchino Del Regno
Subject: Re: [PATCH 2/3] drm/msm/gpu: Park scheduler threads for system suspend
References: <20220310234611.424743-1-robdclark@gmail.com>
 <20220310234611.424743-3-robdclark@gmail.com>
 <3945551d-47d2-1974-f637-1dbc61e14702@amd.com>
 <865abcff-9f52-dca4-df38-b11189c739ff@amd.com>
 <915537e2-ac5b-ab0e-3697-2b16a9ec8f91@amd.com>
In-Reply-To: <915537e2-ac5b-ab0e-3697-2b16a9ec8f91@amd.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Mar 17, 2022 at 05:44:57PM +0100, Christian König wrote:
> On 17.03.22 at 17:18, Rob Clark wrote:
> > On Thu, Mar 17, 2022 at 9:04 AM Christian König wrote:
> > > On 17.03.22 at 16:10, Rob Clark wrote:
> > > > [SNIP]
> > > > userspace frozen != kthread frozen .. that is what this patch is
> > > > trying to address, so we aren't racing between shutting down the hw
> > > > and the scheduler shoveling more jobs at us.
> > > Well exactly that's the problem. The scheduler is supposed to shovel
> > > more jobs at us until it is empty.
> > >
> > > Thinking more about it we will then keep some dma_fence instance
> > > unsignaled and that is an extremely bad idea since it can lead to
> > > deadlocks during suspend.
> > Hmm, perhaps that is true if you need to migrate things out of vram?
> > It is at least not a problem when vram is not involved.
>
> No, it's much wider than that.
>
> See what can happen is that the memory management shrinkers want to wait for
> a dma_fence during suspend.
>
> And if you stop the scheduler they will just wait forever.
>
> What you need to do instead is to drain the scheduler, e.g. call
> drm_sched_entity_flush() with a proper timeout for each entity you have
> created.

Yeah, I think properly flushing the scheduler and stopping it, and cutting
all drivers over to that, sounds like the right approach. Generally suspend
shouldn't be such a critical path that this will hurt us; all the other io
queues get flushed too, afaik. Resume is the thing that needs to go real
fast.

So a patch set to move all drivers that open-code the kthread_park to the
right scheduler function sounds like the right idea here to me.
-Daniel

>
> Regards,
> Christian.
>
> > >
> > > So this patch here is an absolutely clear NAK from my side. If amdgpu is
> > > doing something similar that is a severe bug and needs to be addressed
> > > somehow.
> > I think amdgpu's use of kthread_park is not related to suspend, but
> > didn't look too closely.
> >
> > And perhaps the solution for this problem is more complex in the case
> > of amdgpu, I'm not super familiar with the constraints there. But I
> > think it is a fine solution for integrated GPUs.
> >
> > BR,
> > -R
> >
> > > Regards,
> > > Christian.
> > >
> > > > BR,
> > > > -R
> > >
>

--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
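
The park-versus-drain argument in this thread can be illustrated with a small
userspace model. This is only a sketch under loud assumptions: every toy_*
name below is hypothetical, the "scheduler" is a plain FIFO, and nothing here
is the drm_sched API. In a real driver the drain step would presumably be
drm_sched_entity_flush() with a timeout per entity, as Christian suggests
above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model only; hypothetical names, NOT the kernel drm_sched API. */
#define TOY_MAX_JOBS 8

struct toy_fence {
	bool signaled;		/* set once the job behind it has run */
};

struct toy_sched {
	struct toy_fence *queue[TOY_MAX_JOBS];	/* FIFO of pending job fences */
	size_t head, count;
	bool parked;		/* true once the scheduler has been stopped */
};

/* Queue a job; its fence starts out unsignaled. */
static bool toy_push(struct toy_sched *s, struct toy_fence *f)
{
	if (s->count == TOY_MAX_JOBS)
		return false;
	s->queue[(s->head + s->count) % TOY_MAX_JOBS] = f;
	s->count++;
	f->signaled = false;
	return true;
}

/* Run one queued job and signal its fence. Returns false if nothing ran. */
static bool toy_run_one(struct toy_sched *s)
{
	struct toy_fence *f;

	if (s->parked || s->count == 0)
		return false;
	f = s->queue[s->head];
	s->head = (s->head + 1) % TOY_MAX_JOBS;
	s->count--;
	f->signaled = true;
	return true;
}

/* The approach criticized in the thread: stop with jobs still queued. */
static void toy_park(struct toy_sched *s)
{
	s->parked = true;
}

/*
 * The suggested approach: run jobs until the queue is empty or the step
 * budget (standing in for a timeout) is used up, and only then stop.
 * Returns the number of jobs left behind; 0 means fully drained.
 */
static size_t toy_drain_then_park(struct toy_sched *s, size_t budget)
{
	while (budget-- > 0 && toy_run_one(s))
		;
	s->parked = true;
	return s->count;
}

/*
 * A waiter such as a shrinker: gives the scheduler up to max_polls chances
 * to make progress, then reports whether the fence ever signaled.
 */
static bool toy_wait_fence(struct toy_sched *s, const struct toy_fence *f,
			   size_t max_polls)
{
	while (!f->signaled && max_polls-- > 0)
		toy_run_one(s);
	return f->signaled;
}
```

In this model, a waiter on a scheduler stopped with toy_park() never sees its
fence signal no matter how long it polls, which mirrors the "wait forever"
concern. After toy_drain_then_park() returns 0, every queued fence is already
signaled and the same wait returns immediately. A nonzero return models the
timeout case, which a real driver would still have to handle.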