2019-07-21 14:29:45

by Oded Gabbay

[permalink] [raw]
Subject: [PATCH] habanalabs: don't reset device when getting VRHOT

VRHOT event from the F/W indicates the device has reached a temperature of
100 Celsius degrees. In this case, the driver should only print this
information to the kernel log. The device will shutdown itself
automatically when reaching 125 degrees.

Signed-off-by: Oded Gabbay <[email protected]>
---
drivers/misc/habanalabs/goya/goya.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/misc/habanalabs/goya/goya.c b/drivers/misc/habanalabs/goya/goya.c
index 60e509f64051..1a2c062a57d4 100644
--- a/drivers/misc/habanalabs/goya/goya.c
+++ b/drivers/misc/habanalabs/goya/goya.c
@@ -4449,7 +4449,6 @@ void goya_handle_eqe(struct hl_device *hdev, struct hl_eq_entry *eq_entry)
case GOYA_ASYNC_EVENT_ID_AXI_ECC:
case GOYA_ASYNC_EVENT_ID_L2_RAM_ECC:
case GOYA_ASYNC_EVENT_ID_PSOC_GPIO_05_SW_RESET:
- case GOYA_ASYNC_EVENT_ID_PSOC_GPIO_10_VRHOT_ICRIT:
goya_print_irq_info(hdev, event_type, false);
hl_device_reset(hdev, true, false);
break;
@@ -4485,6 +4484,7 @@ void goya_handle_eqe(struct hl_device *hdev, struct hl_eq_entry *eq_entry)
goya_unmask_irq(hdev, event_type);
break;

+ case GOYA_ASYNC_EVENT_ID_PSOC_GPIO_10_VRHOT_ICRIT:
case GOYA_ASYNC_EVENT_ID_TPC0_BMON_SPMU:
case GOYA_ASYNC_EVENT_ID_TPC1_BMON_SPMU:
case GOYA_ASYNC_EVENT_ID_TPC2_BMON_SPMU:
--
2.17.1


2019-07-22 12:36:42

by Omer Shpigelman

[permalink] [raw]
Subject: RE: [PATCH] habanalabs: don't reset device when getting VRHOT

From: Oded Gabbay <[email protected]>
Sent: Sunday, 21 July 2019 17:28

> VRHOT event from the F/W indicates the device has reached a temperature
> of
> 100 Celsius degrees. In this case, the driver should only print this information
> to the kernel log. The device will shutdown itself automatically when
> reaching 125 degrees.
>
> Signed-off-by: Oded Gabbay <[email protected]>

Reviewed-by: Omer Shpigelman <[email protected]>