From: Oded Gabbay
To: dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org
Cc: Farah Kassabri
Subject: [PATCH 04/10] accel/habanalabs: fix EQ heartbeat mechanism
Date: Wed, 15 Nov 2023 18:39:06 +0200
Message-Id: <20231115163912.1243175-4-ogabbay@kernel.org>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20231115163912.1243175-1-ogabbay@kernel.org>
References: <20231115163912.1243175-1-ogabbay@kernel.org>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

From: Farah Kassabri

Stop rescheduling another heartbeat check when the EQ heartbeat check
fails, as the rescheduled check generates confusing logs in dmesg
saying that the heartbeat fails.

Signed-off-by: Farah Kassabri
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
---
 drivers/accel/habanalabs/common/device.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/accel/habanalabs/common/device.c b/drivers/accel/habanalabs/common/device.c
index d9447aeb3937..6bf5f1d0d005 100644
--- a/drivers/accel/habanalabs/common/device.c
+++ b/drivers/accel/habanalabs/common/device.c
@@ -1044,20 +1044,21 @@ static bool is_pci_link_healthy(struct hl_device *hdev)
 	return (vendor_id == PCI_VENDOR_ID_HABANALABS);
 }
 
-static void hl_device_eq_heartbeat(struct hl_device *hdev)
+static int hl_device_eq_heartbeat_check(struct hl_device *hdev)
 {
-	u64 event_mask = HL_NOTIFIER_EVENT_DEVICE_RESET | HL_NOTIFIER_EVENT_DEVICE_UNAVAILABLE;
 	struct asic_fixed_properties *prop = &hdev->asic_prop;
 
 	if (!prop->cpucp_info.eq_health_check_supported)
-		return;
+		return 0;
 
 	if (hdev->eq_heartbeat_received) {
 		hdev->eq_heartbeat_received = false;
 	} else {
 		dev_err(hdev->dev, "EQ heartbeat event was not received!\n");
-		hl_device_cond_reset(hdev, HL_DRV_RESET_HARD, event_mask);
+		return -EIO;
 	}
+
+	return 0;
 }
 
 static void hl_device_heartbeat(struct work_struct *work)
@@ -1074,10 +1075,9 @@ static void hl_device_heartbeat(struct work_struct *work)
 	/*
 	 * For EQ health check need to check if driver received the heartbeat eq event
 	 * in order to validate the eq is working.
+	 * Only if both the EQ is healthy and we managed to send the next heartbeat reschedule.
 	 */
-	hl_device_eq_heartbeat(hdev);
-
-	if (!hdev->asic_funcs->send_heartbeat(hdev))
+	if ((!hl_device_eq_heartbeat_check(hdev)) && (!hdev->asic_funcs->send_heartbeat(hdev)))
 		goto reschedule;
 
 	if (hl_device_operational(hdev, NULL))
-- 
2.34.1
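
The reschedule condition introduced by this patch can be modelled in
userspace as a rough sketch. Everything below is hypothetical and exists
only for illustration: struct fake_dev, eq_heartbeat_check(),
send_heartbeat(), should_reschedule() and heartbeat_policy_selftest() are
stand-ins, not the driver's real types or functions. The sketch assumes,
as the diff suggests, that a return value of 0 means success for both the
EQ check and the heartbeat send.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver state; NOT the real struct hl_device. */
struct fake_dev {
	bool eq_health_check_supported;
	bool eq_heartbeat_received;
	int send_heartbeat_rc;	/* value the stubbed send_heartbeat() returns */
	int heartbeats_sent;	/* number of times the stub was called */
};

/* Models the patched check: 0 if healthy or unsupported, -EIO if the event is missing. */
static int eq_heartbeat_check(struct fake_dev *dev)
{
	if (!dev->eq_health_check_supported)
		return 0;

	if (dev->eq_heartbeat_received) {
		/* Consume the event so the next period must observe a new one. */
		dev->eq_heartbeat_received = false;
		return 0;
	}

	return -EIO;
}

/* Stub for the ASIC-specific send; counts calls so short-circuiting is visible. */
static int send_heartbeat(struct fake_dev *dev)
{
	dev->heartbeats_sent++;
	return dev->send_heartbeat_rc;
}

/* True when the work item would take the "goto reschedule" path. */
static bool should_reschedule(struct fake_dev *dev)
{
	return !eq_heartbeat_check(dev) && !send_heartbeat(dev);
}

/* Usage example: a healthy period reschedules, a missed EQ event does not. */
static void heartbeat_policy_selftest(void)
{
	struct fake_dev dev = {
		.eq_health_check_supported = true,
		.eq_heartbeat_received = true,
		.send_heartbeat_rc = 0,
		.heartbeats_sent = 0,
	};

	/* First period: EQ event was seen and the send succeeds. */
	assert(should_reschedule(&dev));
	assert(dev.heartbeats_sent == 1);

	/* Second period: no new EQ event arrived, so no send and no reschedule. */
	assert(!should_reschedule(&dev));
	assert(dev.heartbeats_sent == 1);
}
```

Because && short-circuits, a failed EQ check in this model means no
further heartbeat is sent and the work is not rescheduled.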