From: Akhil P Oommen
To: freedreno, dri-devel@lists.freedesktop.org,
    linux-arm-msm@vger.kernel.org, Rob Clark, Dmitry Baryshkov,
    Bjorn Andersson
Cc: Abhinav Kumar, AngeloGioacchino Del Regno, Christian König,
    Dan Carpenter, Daniel Vetter, David Airlie, Dmitry Osipenko,
    Douglas Anderson, Emma Anholt, Jonathan Marek, Jordan Crouse,
    Sean Paul, Stephen Boyd, Viresh Kumar, Vladimir Lypak, Wang Qing,
    Yangtao Li, linux-kernel@vger.kernel.org
Subject: [PATCH v1 00/10] Support for GMU coredump and some related improvements
Date: Wed, 2 Mar 2022 22:57:26 +0530
Message-Id: <1646242056-2456-1-git-send-email-quic_akhilpo@quicinc.com>
X-Mailer: git-send-email 2.7.4
X-Mailing-List: linux-kernel@vger.kernel.org

The major enhancement in this series is support for a minimal GMU
coredump that can be captured inline instead of through our usual
recover worker. This is helpful when GMU errors occur in the GPU
wake-up/suspend path, because it lets us capture a snapshot of the GMU
before we do a suspend. I had to introduce a lock to synchronize the
crashstate because runtime suspend can happen from an asynchronous RPM
thread.

Apart from this, there are some improvements to handle GMU errors
gracefully, either by propagating the error back to the parent or by
retrying. There are also a few patches that fix some trivial bugs in
the related code.

Akhil P Oommen (10):
  drm/msm/a6xx: Add helper to check smmu is stalled
  drm/msm/a6xx: Send NMI to gmu when it is hung
  drm/msm/a6xx: Avoid gmu lock in pm ops
  drm/msm/a6xx: Enhance debugging of gmu faults
  drm/msm: Do recovery on hw_init failure
  drm/msm/a6xx: Propagate OOB set error
  drm/msm/adreno: Retry on gpu resume failure
  drm/msm/a6xx: Remove clk votes on failure
  drm/msm: Remove pm_runtime_get() from msm_job_run()
  drm/msm/a6xx: Free gmu_debug crashstate bo

 drivers/gpu/drm/msm/adreno/a6xx_gmu.c       | 89 +++++++++++++++++++++++------
 drivers/gpu/drm/msm/adreno/a6xx_gmu.h       |  1 +
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c       | 31 +++++++---
 drivers/gpu/drm/msm/adreno/a6xx_gpu.h       |  4 +-
 drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c | 79 +++++++++++++++++++++----
 drivers/gpu/drm/msm/adreno/adreno_device.c  | 10 +++-
 drivers/gpu/drm/msm/adreno/adreno_gpu.c     | 10 +++-
 drivers/gpu/drm/msm/adreno/adreno_gpu.h     |  2 +
 drivers/gpu/drm/msm/msm_gpu.c               | 28 ++++++++-
 drivers/gpu/drm/msm/msm_gpu.h               | 11 ++--
 drivers/gpu/drm/msm/msm_ringbuffer.c        |  4 --
 11 files changed, 218 insertions(+), 51 deletions(-)

-- 
2.7.4