2022-12-21 08:53:56

by Oded Gabbay

[permalink] [raw]
Subject: [PATCH 1/2] habanalabs/gaudi2: dump event description even if no cause

From: Ofir Bitton <[email protected]>

In order to have the no-cause error print be more informative,
we add the event description in addition to the event id.

Signed-off-by: Ofir Bitton <[email protected]>
Reviewed-by: Oded Gabbay <[email protected]>
Signed-off-by: Oded Gabbay <[email protected]>
---
drivers/misc/habanalabs/gaudi2/gaudi2.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/misc/habanalabs/gaudi2/gaudi2.c b/drivers/misc/habanalabs/gaudi2/gaudi2.c
index 987ec44fa378..7df1a68dd403 100644
--- a/drivers/misc/habanalabs/gaudi2/gaudi2.c
+++ b/drivers/misc/habanalabs/gaudi2/gaudi2.c
@@ -9271,8 +9271,8 @@ static void gaudi2_handle_eqe(struct hl_device *hdev, struct hl_eq_entry *eq_ent
if (error_count == GAUDI2_NA_EVENT_CAUSE && !is_info_event(event_type))
gaudi2_print_event(hdev, event_type, true, "%d", event_type);
else if (error_count == 0)
- dev_err_ratelimited(hdev->dev,
- "No Error cause for H/W event %d\n", event_type);
+ gaudi2_print_event(hdev, event_type, true,
+ "No error cause for H/W event %u\n", event_type);

if ((gaudi2_irq_map_table[event_type].reset || reset_required) &&
(hdev->hard_reset_on_fw_events ||
--
2.34.1


2022-12-21 08:54:46

by Oded Gabbay

[permalink] [raw]
Subject: [PATCH 2/2] habanalabs: fix dma-buf release handling if dma_buf_fd() fails

From: Tomer Tayar <[email protected]>

The dma-buf private object is freed if a call to dma_buf_fd() fails,
and because a file was already associated with the dma-buf in
dma_buf_export(), the release op will be called and will use this
object.

Mark the 'priv' field as NULL in this case, and avoid accessing it from
the release op.

Signed-off-by: Tomer Tayar <[email protected]>
Reviewed-by: Oded Gabbay <[email protected]>
Signed-off-by: Oded Gabbay <[email protected]>
---
drivers/misc/habanalabs/common/memory.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/misc/habanalabs/common/memory.c b/drivers/misc/habanalabs/common/memory.c
index 693456366753..a2d24c9a3d1e 100644
--- a/drivers/misc/habanalabs/common/memory.c
+++ b/drivers/misc/habanalabs/common/memory.c
@@ -1782,7 +1782,12 @@ static void hl_unmap_dmabuf(struct dma_buf_attachment *attachment,
static void hl_release_dmabuf(struct dma_buf *dmabuf)
{
struct hl_dmabuf_priv *hl_dmabuf = dmabuf->priv;
- struct hl_ctx *ctx = hl_dmabuf->ctx;
+ struct hl_ctx *ctx;
+
+ if (!hl_dmabuf)
+ return;
+
+ ctx = hl_dmabuf->ctx;

if (hl_dmabuf->memhash_hnode) {
mutex_lock(&ctx->mem_hash_lock);
@@ -1822,7 +1827,7 @@ static int export_dmabuf(struct hl_ctx *ctx,

fd = dma_buf_fd(hl_dmabuf->dmabuf, flags);
if (fd < 0) {
- dev_err(hdev->dev, "failed to get a file descriptor for a dma-buf\n");
+ dev_err(hdev->dev, "failed to get a file descriptor for a dma-buf, %d\n", fd);
rc = fd;
goto err_dma_buf_put;
}
@@ -1835,6 +1840,7 @@ static int export_dmabuf(struct hl_ctx *ctx,
return 0;

err_dma_buf_put:
+ hl_dmabuf->dmabuf->priv = NULL;
dma_buf_put(hl_dmabuf->dmabuf);
return rc;
}
--
2.34.1