From: Rob Clark
To: dri-devel@lists.freedesktop.org
Cc: freedreno@lists.freedesktop.org, linux-arm-msm@vger.kernel.org,
	Rob Clark, Rob Clark, Abhinav Kumar, Dmitry Baryshkov, Sean Paul,
	Marijn Suijten, David Airlie, Daniel Vetter,
	linux-kernel@vger.kernel.org (open list)
Subject: [PATCH] drm/msm/gpu: Skip retired submits in recover worker
Date: Fri, 17 Nov 2023 07:24:28 -0800
Message-ID: <20231117152428.367592-1-robdclark@gmail.com>

From: Rob Clark

If we somehow raced with submit retiring, either while waiting for the
worker to have a chance to run or while acquiring the gpu lock, then the
recover worker should just bail.

Signed-off-by: Rob Clark
---
 drivers/gpu/drm/msm/msm_gpu.c | 41 +++++++++++++++++++----------------
 1 file changed, 22 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index 3fad5d58262f..fd3dceed86f8 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -365,29 +365,31 @@ static void recover_worker(struct kthread_work *work)
 	DRM_DEV_ERROR(dev->dev, "%s: hangcheck recover!\n", gpu->name);
 
 	submit = find_submit(cur_ring, cur_ring->memptrs->fence + 1);
-	if (submit) {
-		/* Increment the fault counts */
-		submit->queue->faults++;
-		if (submit->aspace)
-			submit->aspace->faults++;
 
-		get_comm_cmdline(submit, &comm, &cmd);
+	/*
+	 * If the submit retired while we were waiting for the worker to run,
+	 * or waiting to acquire the gpu lock, then nothing more to do.
+	 */
+	if (!submit)
+		goto out_unlock;
 
-		if (comm && cmd) {
-			DRM_DEV_ERROR(dev->dev, "%s: offending task: %s (%s)\n",
-				gpu->name, comm, cmd);
+	/* Increment the fault counts */
+	submit->queue->faults++;
+	if (submit->aspace)
+		submit->aspace->faults++;
 
-			msm_rd_dump_submit(priv->hangrd, submit,
-				"offending task: %s (%s)", comm, cmd);
-		} else {
-			msm_rd_dump_submit(priv->hangrd, submit, NULL);
-		}
+	get_comm_cmdline(submit, &comm, &cmd);
+
+	if (comm && cmd) {
+		DRM_DEV_ERROR(dev->dev, "%s: offending task: %s (%s)\n",
+			gpu->name, comm, cmd);
+
+		msm_rd_dump_submit(priv->hangrd, submit,
+			"offending task: %s (%s)", comm, cmd);
 	} else {
-		/*
-		 * We couldn't attribute this fault to any particular context,
-		 * so increment the global fault count instead.
-		 */
-		gpu->global_faults++;
+		DRM_DEV_ERROR(dev->dev, "%s: offending task: unknown\n", gpu->name);
+
+		msm_rd_dump_submit(priv->hangrd, submit, NULL);
 	}
 
 	/* Record the crash state */
@@ -440,6 +442,7 @@ static void recover_worker(struct kthread_work *work)
 
 	pm_runtime_put(&gpu->pdev->dev);
 
+out_unlock:
 	mutex_unlock(&gpu->lock);
 
 	msm_gpu_retire(gpu);
-- 
2.41.0
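
For readers less familiar with the pattern, here is a minimal userspace sketch of the "look up under the lock, bail to a single unlock label if the work already retired" shape that this patch gives recover_worker(). It is not the driver code: struct submit, struct gpu, the simplified find_submit() and recover() below are hypothetical stand-ins, and a pthread mutex is assumed in place of the kernel mutex.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the real msm structures. */
struct submit {
	unsigned int fence;
	int retired;		/* non-zero once the submit has completed */
	unsigned int faults;
};

struct gpu {
	pthread_mutex_t lock;
	struct submit *submits;
	size_t nr_submits;
	unsigned int resets;	/* how many times recovery actually ran */
};

/* Return the in-flight submit with this fence, or NULL if it already retired. */
static struct submit *find_submit(struct gpu *gpu, unsigned int fence)
{
	for (size_t i = 0; i < gpu->nr_submits; i++) {
		struct submit *s = &gpu->submits[i];

		if (s->fence == fence && !s->retired)
			return s;
	}
	return NULL;
}

/* Returns 1 if recovery ran, 0 if we bailed because the submit had retired. */
static int recover(struct gpu *gpu, unsigned int hung_fence)
{
	struct submit *submit;
	int recovered = 0;

	pthread_mutex_lock(&gpu->lock);

	/*
	 * The submit may have retired between the hang being detected and
	 * this function taking the lock; in that case there is nothing to do.
	 */
	submit = find_submit(gpu, hung_fence);
	if (!submit)
		goto out_unlock;

	submit->faults++;
	gpu->resets++;
	recovered = 1;

out_unlock:
	/* Single exit path, so the lock is released on every return. */
	pthread_mutex_unlock(&gpu->lock);
	return recovered;
}
```

In this sketch, calling recover() for a fence whose submit is already marked retired returns 0 and leaves the counters untouched, while a still-pending fence returns 1 and bumps both faults and resets.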