2022-12-31 20:50:00

by Sasha Levin

[permalink] [raw]
Subject: [PATCH AUTOSEL 6.1 1/7] ASoC: SOF: Revert: "core: unregister clients and machine drivers in .shutdown"

From: Kai Vehmanen <[email protected]>

[ Upstream commit 44fda61d2bcfb74a942df93959e083a4e8eff75f ]

The unregister machine drivers call is not safe to do when
kexec is used. Kexec-lite gets blocked with following backtrace:

[ 84.943749] Freezing user space processes ... (elapsed 0.111 seconds) done.
[ 246.784446] INFO: task kexec-lite:5123 blocked for more than 122 seconds.
[ 246.819035] Call Trace:
[ 246.821782] <TASK>
[ 246.824186] __schedule+0x5f9/0x1263
[ 246.828231] schedule+0x87/0xc5
[ 246.831779] snd_card_disconnect_sync+0xb5/0x127
...
[ 246.889249] snd_sof_device_shutdown+0xb4/0x150
[ 246.899317] pci_device_shutdown+0x37/0x61
[ 246.903990] device_shutdown+0x14c/0x1d6
[ 246.908391] kernel_kexec+0x45/0xb9

This reverts commit 83bfc7e793b555291785136c3ae86abcdc046887.

Reported-by: Ricardo Ribalda <[email protected]>
Cc: Ricardo Ribalda <[email protected]>
Signed-off-by: Kai Vehmanen <[email protected]>
Reviewed-by: Pierre-Louis Bossart <[email protected]>
Reviewed-by: Péter Ujfalusi <[email protected]>
Reviewed-by: Ranjani Sridharan <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Mark Brown <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
sound/soc/sof/core.c | 9 ---------
1 file changed, 9 deletions(-)

diff --git a/sound/soc/sof/core.c b/sound/soc/sof/core.c
index 3e6141d03770..625977a29d8a 100644
--- a/sound/soc/sof/core.c
+++ b/sound/soc/sof/core.c
@@ -475,19 +475,10 @@ EXPORT_SYMBOL(snd_sof_device_remove);
int snd_sof_device_shutdown(struct device *dev)
{
struct snd_sof_dev *sdev = dev_get_drvdata(dev);
- struct snd_sof_pdata *pdata = sdev->pdata;

if (IS_ENABLED(CONFIG_SND_SOC_SOF_PROBE_WORK_QUEUE))
cancel_work_sync(&sdev->probe_work);

- /*
- * make sure clients and machine driver(s) are unregistered to force
- * all userspace devices to be closed prior to the DSP shutdown sequence
- */
- sof_unregister_clients(sdev);
-
- snd_sof_machine_unregister(sdev, pdata);
-
if (sdev->fw_state == SOF_FW_BOOT_COMPLETE)
return snd_sof_shutdown(sdev);

--
2.35.1


2022-12-31 21:09:03

by Sasha Levin

[permalink] [raw]
Subject: [PATCH AUTOSEL 6.1 4/7] ASoC: SOF: mediatek: initialize panic_info to zero

From: YC Hung <[email protected]>

[ Upstream commit 7bd220f2ba9014b78f0304178103393554b8c4fe ]

Coverity spotted that panic_info is not initialized to zero in
mtk_adsp_dump. Using uninitialized value panic_info.linenum when
calling snd_sof_get_status. Fix this coverity by initializing
panic_info struct as zero.

Signed-off-by: YC Hung <[email protected]>
Reviewed-by: Curtis Malainey <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Mark Brown <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
sound/soc/sof/mediatek/mtk-adsp-common.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sound/soc/sof/mediatek/mtk-adsp-common.c b/sound/soc/sof/mediatek/mtk-adsp-common.c
index 1e0769c668a7..de8dbe27cd0d 100644
--- a/sound/soc/sof/mediatek/mtk-adsp-common.c
+++ b/sound/soc/sof/mediatek/mtk-adsp-common.c
@@ -60,7 +60,7 @@ void mtk_adsp_dump(struct snd_sof_dev *sdev, u32 flags)
{
char *level = (flags & SOF_DBG_DUMP_OPTIONAL) ? KERN_DEBUG : KERN_ERR;
struct sof_ipc_dsp_oops_xtensa xoops;
- struct sof_ipc_panic_info panic_info;
+ struct sof_ipc_panic_info panic_info = {};
u32 stack[MTK_ADSP_STACK_DUMP_SIZE];
u32 status;

--
2.35.1

2022-12-31 21:30:36

by Sasha Levin

[permalink] [raw]
Subject: [PATCH AUTOSEL 6.1 3/7] ASoC: Intel: bytcr_rt5640: Add quirk for the Advantech MICA-071 tablet

From: Hans de Goede <[email protected]>

[ Upstream commit a1dec9d70b6ad97087b60b81d2492134a84208c6 ]

The Advantech MICA-071 tablet deviates from the defaults for
a non CR Bay Trail based tablet in several ways:

1. It uses an analog MIC on IN3 rather then using DMIC1
2. It only has 1 speaker
3. It needs the OVCD current threshold to be set to 1500uA instead of
the default 2000uA to reliable differentiate between headphones vs
headsets

Add a quirk with these settings for this tablet.

Signed-off-by: Hans de Goede <[email protected]>
Acked-by: Pierre-Louis Bossart <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Mark Brown <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
sound/soc/intel/boards/bytcr_rt5640.c | 15 +++++++++++++++
1 file changed, 15 insertions(+)

diff --git a/sound/soc/intel/boards/bytcr_rt5640.c b/sound/soc/intel/boards/bytcr_rt5640.c
index fb9d9e271845..ddd2625bed90 100644
--- a/sound/soc/intel/boards/bytcr_rt5640.c
+++ b/sound/soc/intel/boards/bytcr_rt5640.c
@@ -570,6 +570,21 @@ static const struct dmi_system_id byt_rt5640_quirk_table[] = {
BYT_RT5640_SSP0_AIF1 |
BYT_RT5640_MCLK_EN),
},
+ {
+ /* Advantech MICA-071 */
+ .matches = {
+ DMI_EXACT_MATCH(DMI_SYS_VENDOR, "Advantech"),
+ DMI_EXACT_MATCH(DMI_PRODUCT_NAME, "MICA-071"),
+ },
+ /* OVCD Th = 1500uA to reliable detect head-phones vs -set */
+ .driver_data = (void *)(BYT_RT5640_IN3_MAP |
+ BYT_RT5640_JD_SRC_JD2_IN4N |
+ BYT_RT5640_OVCD_TH_1500UA |
+ BYT_RT5640_OVCD_SF_0P75 |
+ BYT_RT5640_MONO_SPEAKER |
+ BYT_RT5640_DIFF_MIC |
+ BYT_RT5640_MCLK_EN),
+ },
{
.matches = {
DMI_EXACT_MATCH(DMI_SYS_VENDOR, "ARCHOS"),
--
2.35.1

2022-12-31 21:33:19

by Sasha Levin

[permalink] [raw]
Subject: [PATCH AUTOSEL 6.1 6/7] drm/amdkfd: Fix kfd_process_device_init_vm error handling

From: Philip Yang <[email protected]>

[ Upstream commit 29d48b87db64b6697ddad007548e51d032081c59 ]

Should only destroy the ib_mem and let process cleanup worker to free
the outstanding BOs. Reset the pointer in pdd->qpd structure, to avoid
NULL pointer access in process destroy worker.

BUG: kernel NULL pointer dereference, address: 0000000000000010
Call Trace:
amdgpu_amdkfd_gpuvm_unmap_gtt_bo_from_kernel+0x46/0xb0 [amdgpu]
kfd_process_device_destroy_cwsr_dgpu+0x40/0x70 [amdgpu]
kfd_process_destroy_pdds+0x71/0x190 [amdgpu]
kfd_process_wq_release+0x2a2/0x3b0 [amdgpu]
process_one_work+0x2a1/0x600
worker_thread+0x39/0x3d0

Signed-off-by: Philip Yang <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/gpu/drm/amd/amdkfd/kfd_process.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index 951b63677248..9821fa9268d3 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -689,13 +689,13 @@ void kfd_process_destroy_wq(void)
}

static void kfd_process_free_gpuvm(struct kgd_mem *mem,
- struct kfd_process_device *pdd, void *kptr)
+ struct kfd_process_device *pdd, void **kptr)
{
struct kfd_dev *dev = pdd->dev;

- if (kptr) {
+ if (kptr && *kptr) {
amdgpu_amdkfd_gpuvm_unmap_gtt_bo_from_kernel(mem);
- kptr = NULL;
+ *kptr = NULL;
}

amdgpu_amdkfd_gpuvm_unmap_memory_from_gpu(dev->adev, mem, pdd->drm_priv);
@@ -795,7 +795,7 @@ static void kfd_process_device_destroy_ib_mem(struct kfd_process_device *pdd)
if (!qpd->ib_kaddr || !qpd->ib_base)
return;

- kfd_process_free_gpuvm(qpd->ib_mem, pdd, qpd->ib_kaddr);
+ kfd_process_free_gpuvm(qpd->ib_mem, pdd, &qpd->ib_kaddr);
}

struct kfd_process *kfd_create_process(struct file *filep)
@@ -1277,7 +1277,7 @@ static void kfd_process_device_destroy_cwsr_dgpu(struct kfd_process_device *pdd)
if (!dev->cwsr_enabled || !qpd->cwsr_kaddr || !qpd->cwsr_base)
return;

- kfd_process_free_gpuvm(qpd->cwsr_mem, pdd, qpd->cwsr_kaddr);
+ kfd_process_free_gpuvm(qpd->cwsr_mem, pdd, &qpd->cwsr_kaddr);
}

void kfd_process_set_trap_handler(struct qcm_process_device *qpd,
@@ -1598,8 +1598,8 @@ int kfd_process_device_init_vm(struct kfd_process_device *pdd,
return 0;

err_init_cwsr:
+ kfd_process_device_destroy_ib_mem(pdd);
err_reserve_ib_mem:
- kfd_process_device_free_bos(pdd);
pdd->drm_priv = NULL;

return ret;
--
2.35.1

2022-12-31 21:49:42

by Sasha Levin

[permalink] [raw]
Subject: [PATCH AUTOSEL 6.1 2/7] 9p/client: fix data race on req->status

From: Dominique Martinet <[email protected]>

[ Upstream commit 1a4f69ef15ec29b213e2b086b2502644e8ef76ee ]

KCSAN reported a race between writing req->status in p9_client_cb and
accessing it in p9_client_rpc's wait_event.

Accesses to req itself is protected by the data barrier (writing req
fields, write barrier, writing status // reading status, read barrier,
reading other req fields), but status accesses themselves apparently
also must be annotated properly with WRITE_ONCE/READ_ONCE when we
access it without locks.

Follows:
- error paths writing status in various threads all can notify
p9_client_rpc, so these all also need WRITE_ONCE
- there's a similar read loop in trans_virtio for zc case that also
needs READ_ONCE
- other reads in trans_fd should be protected by the trans_fd lock and
lists state machine, as corresponding writers all are within trans_fd
and should be under the same lock. If KCSAN complains on them we likely
will have something else to fix as well, so it's better to leave them
unmarked and look again if required.

Link: https://lkml.kernel.org/r/[email protected]
Reported-by: Naresh Kamboju <[email protected]>
Suggested-by: Marco Elver <[email protected]>
Acked-by: Marco Elver <[email protected]>
Reviewed-by: Christian Schoenebeck <[email protected]>
Signed-off-by: Dominique Martinet <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
net/9p/client.c | 15 ++++++++-------
net/9p/trans_fd.c | 12 ++++++------
net/9p/trans_rdma.c | 4 ++--
net/9p/trans_virtio.c | 9 +++++----
net/9p/trans_xen.c | 4 ++--
5 files changed, 23 insertions(+), 21 deletions(-)

diff --git a/net/9p/client.c b/net/9p/client.c
index aaa37b07e30a..29565e021495 100644
--- a/net/9p/client.c
+++ b/net/9p/client.c
@@ -438,7 +438,7 @@ void p9_client_cb(struct p9_client *c, struct p9_req_t *req, int status)
* the status change is visible to another thread
*/
smp_wmb();
- req->status = status;
+ WRITE_ONCE(req->status, status);

wake_up(&req->wq);
p9_debug(P9_DEBUG_MUX, "wakeup: %d\n", req->tc.tag);
@@ -600,7 +600,7 @@ static int p9_client_flush(struct p9_client *c, struct p9_req_t *oldreq)
/* if we haven't received a response for oldreq,
* remove it from the list
*/
- if (oldreq->status == REQ_STATUS_SENT) {
+ if (READ_ONCE(oldreq->status) == REQ_STATUS_SENT) {
if (c->trans_mod->cancelled)
c->trans_mod->cancelled(c, oldreq);
}
@@ -697,7 +697,8 @@ p9_client_rpc(struct p9_client *c, int8_t type, const char *fmt, ...)
}
again:
/* Wait for the response */
- err = wait_event_killable(req->wq, req->status >= REQ_STATUS_RCVD);
+ err = wait_event_killable(req->wq,
+ READ_ONCE(req->status) >= REQ_STATUS_RCVD);

/* Make sure our req is coherent with regard to updates in other
* threads - echoes to wmb() in the callback
@@ -711,7 +712,7 @@ p9_client_rpc(struct p9_client *c, int8_t type, const char *fmt, ...)
goto again;
}

- if (req->status == REQ_STATUS_ERROR) {
+ if (READ_ONCE(req->status) == REQ_STATUS_ERROR) {
p9_debug(P9_DEBUG_ERROR, "req_status error %d\n", req->t_err);
err = req->t_err;
}
@@ -724,7 +725,7 @@ p9_client_rpc(struct p9_client *c, int8_t type, const char *fmt, ...)
p9_client_flush(c, req);

/* if we received the response anyway, don't signal error */
- if (req->status == REQ_STATUS_RCVD)
+ if (READ_ONCE(req->status) == REQ_STATUS_RCVD)
err = 0;
}
recalc_sigpending:
@@ -793,7 +794,7 @@ static struct p9_req_t *p9_client_zc_rpc(struct p9_client *c, int8_t type,
if (err != -ERESTARTSYS)
goto recalc_sigpending;
}
- if (req->status == REQ_STATUS_ERROR) {
+ if (READ_ONCE(req->status) == REQ_STATUS_ERROR) {
p9_debug(P9_DEBUG_ERROR, "req_status error %d\n", req->t_err);
err = req->t_err;
}
@@ -806,7 +807,7 @@ static struct p9_req_t *p9_client_zc_rpc(struct p9_client *c, int8_t type,
p9_client_flush(c, req);

/* if we received the response anyway, don't signal error */
- if (req->status == REQ_STATUS_RCVD)
+ if (READ_ONCE(req->status) == REQ_STATUS_RCVD)
err = 0;
}
recalc_sigpending:
diff --git a/net/9p/trans_fd.c b/net/9p/trans_fd.c
index 07db2f436d44..5a1aecf7fe48 100644
--- a/net/9p/trans_fd.c
+++ b/net/9p/trans_fd.c
@@ -202,11 +202,11 @@ static void p9_conn_cancel(struct p9_conn *m, int err)

list_for_each_entry_safe(req, rtmp, &m->req_list, req_list) {
list_move(&req->req_list, &cancel_list);
- req->status = REQ_STATUS_ERROR;
+ WRITE_ONCE(req->status, REQ_STATUS_ERROR);
}
list_for_each_entry_safe(req, rtmp, &m->unsent_req_list, req_list) {
list_move(&req->req_list, &cancel_list);
- req->status = REQ_STATUS_ERROR;
+ WRITE_ONCE(req->status, REQ_STATUS_ERROR);
}

spin_unlock(&m->req_lock);
@@ -467,7 +467,7 @@ static void p9_write_work(struct work_struct *work)

req = list_entry(m->unsent_req_list.next, struct p9_req_t,
req_list);
- req->status = REQ_STATUS_SENT;
+ WRITE_ONCE(req->status, REQ_STATUS_SENT);
p9_debug(P9_DEBUG_TRANS, "move req %p\n", req);
list_move_tail(&req->req_list, &m->req_list);

@@ -676,7 +676,7 @@ static int p9_fd_request(struct p9_client *client, struct p9_req_t *req)
return m->err;

spin_lock(&m->req_lock);
- req->status = REQ_STATUS_UNSENT;
+ WRITE_ONCE(req->status, REQ_STATUS_UNSENT);
list_add_tail(&req->req_list, &m->unsent_req_list);
spin_unlock(&m->req_lock);

@@ -703,7 +703,7 @@ static int p9_fd_cancel(struct p9_client *client, struct p9_req_t *req)

if (req->status == REQ_STATUS_UNSENT) {
list_del(&req->req_list);
- req->status = REQ_STATUS_FLSHD;
+ WRITE_ONCE(req->status, REQ_STATUS_FLSHD);
p9_req_put(client, req);
ret = 0;
}
@@ -732,7 +732,7 @@ static int p9_fd_cancelled(struct p9_client *client, struct p9_req_t *req)
* remove it from the list.
*/
list_del(&req->req_list);
- req->status = REQ_STATUS_FLSHD;
+ WRITE_ONCE(req->status, REQ_STATUS_FLSHD);
spin_unlock(&m->req_lock);

p9_req_put(client, req);
diff --git a/net/9p/trans_rdma.c b/net/9p/trans_rdma.c
index 6ff706760676..e9a830c69058 100644
--- a/net/9p/trans_rdma.c
+++ b/net/9p/trans_rdma.c
@@ -507,7 +507,7 @@ static int rdma_request(struct p9_client *client, struct p9_req_t *req)
* because doing if after could erase the REQ_STATUS_RCVD
* status in case of a very fast reply.
*/
- req->status = REQ_STATUS_SENT;
+ WRITE_ONCE(req->status, REQ_STATUS_SENT);
err = ib_post_send(rdma->qp, &wr, NULL);
if (err)
goto send_error;
@@ -517,7 +517,7 @@ static int rdma_request(struct p9_client *client, struct p9_req_t *req)

/* Handle errors that happened during or while preparing the send: */
send_error:
- req->status = REQ_STATUS_ERROR;
+ WRITE_ONCE(req->status, REQ_STATUS_ERROR);
kfree(c);
p9_debug(P9_DEBUG_ERROR, "Error %d in rdma_request()\n", err);

diff --git a/net/9p/trans_virtio.c b/net/9p/trans_virtio.c
index e757f0601304..3f3eb03cda7d 100644
--- a/net/9p/trans_virtio.c
+++ b/net/9p/trans_virtio.c
@@ -263,7 +263,7 @@ p9_virtio_request(struct p9_client *client, struct p9_req_t *req)

p9_debug(P9_DEBUG_TRANS, "9p debug: virtio request\n");

- req->status = REQ_STATUS_SENT;
+ WRITE_ONCE(req->status, REQ_STATUS_SENT);
req_retry:
spin_lock_irqsave(&chan->lock, flags);

@@ -469,7 +469,7 @@ p9_virtio_zc_request(struct p9_client *client, struct p9_req_t *req,
inlen = n;
}
}
- req->status = REQ_STATUS_SENT;
+ WRITE_ONCE(req->status, REQ_STATUS_SENT);
req_retry_pinned:
spin_lock_irqsave(&chan->lock, flags);

@@ -532,9 +532,10 @@ p9_virtio_zc_request(struct p9_client *client, struct p9_req_t *req,
spin_unlock_irqrestore(&chan->lock, flags);
kicked = 1;
p9_debug(P9_DEBUG_TRANS, "virtio request kicked\n");
- err = wait_event_killable(req->wq, req->status >= REQ_STATUS_RCVD);
+ err = wait_event_killable(req->wq,
+ READ_ONCE(req->status) >= REQ_STATUS_RCVD);
// RERROR needs reply (== error string) in static data
- if (req->status == REQ_STATUS_RCVD &&
+ if (READ_ONCE(req->status) == REQ_STATUS_RCVD &&
unlikely(req->rc.sdata[4] == P9_RERROR))
handle_rerror(req, in_hdr_len, offs, in_pages);

diff --git a/net/9p/trans_xen.c b/net/9p/trans_xen.c
index aaa5fd364691..cf1b89ba522b 100644
--- a/net/9p/trans_xen.c
+++ b/net/9p/trans_xen.c
@@ -157,7 +157,7 @@ static int p9_xen_request(struct p9_client *client, struct p9_req_t *p9_req)
&masked_prod, masked_cons,
XEN_9PFS_RING_SIZE(ring));

- p9_req->status = REQ_STATUS_SENT;
+ WRITE_ONCE(p9_req->status, REQ_STATUS_SENT);
virt_wmb(); /* write ring before updating pointer */
prod += size;
ring->intf->out_prod = prod;
@@ -212,7 +212,7 @@ static void p9_xen_response(struct work_struct *work)
dev_warn(&priv->dev->dev,
"requested packet size too big: %d for tag %d with capacity %zd\n",
h.size, h.tag, req->rc.capacity);
- req->status = REQ_STATUS_ERROR;
+ WRITE_ONCE(req->status, REQ_STATUS_ERROR);
goto recv_error;
}

--
2.35.1

2022-12-31 21:51:02

by Sasha Levin

[permalink] [raw]
Subject: [PATCH AUTOSEL 6.1 7/7] drm/amdkfd: Fix double release compute pasid

From: Philip Yang <[email protected]>

[ Upstream commit 1a799c4c190ea9f0e81028e3eb3037ed0ab17ff5 ]

If kfd_process_device_init_vm returns failure after vm is converted to
compute vm and vm->pasid set to compute pasid, KFD will not take
pdd->drm_file reference. As a result, drm close file handler maybe
called to release the compute pasid before KFD process destroy worker to
release the same pasid and set vm->pasid to zero, this generates below
WARNING backtrace and NULL pointer access.

Add helper amdgpu_amdkfd_gpuvm_set_vm_pasid and call it at the last step
of kfd_process_device_init_vm, to ensure vm pasid is the original pasid
if acquiring vm failed or is the compute pasid with pdd->drm_file
reference taken to avoid double release same pasid.

amdgpu: Failed to create process VM object
ida_free called for id=32770 which is not allocated.
WARNING: CPU: 57 PID: 72542 at ../lib/idr.c:522 ida_free+0x96/0x140
RIP: 0010:ida_free+0x96/0x140
Call Trace:
amdgpu_pasid_free_delayed+0xe1/0x2a0 [amdgpu]
amdgpu_driver_postclose_kms+0x2d8/0x340 [amdgpu]
drm_file_free.part.13+0x216/0x270 [drm]
drm_close_helper.isra.14+0x60/0x70 [drm]
drm_release+0x6e/0xf0 [drm]
__fput+0xcc/0x280
____fput+0xe/0x20
task_work_run+0x96/0xc0
do_exit+0x3d0/0xc10

BUG: kernel NULL pointer dereference, address: 0000000000000000
RIP: 0010:ida_free+0x76/0x140
Call Trace:
amdgpu_pasid_free_delayed+0xe1/0x2a0 [amdgpu]
amdgpu_driver_postclose_kms+0x2d8/0x340 [amdgpu]
drm_file_free.part.13+0x216/0x270 [drm]
drm_close_helper.isra.14+0x60/0x70 [drm]
drm_release+0x6e/0xf0 [drm]
__fput+0xcc/0x280
____fput+0xe/0x20
task_work_run+0x96/0xc0
do_exit+0x3d0/0xc10

Signed-off-by: Philip Yang <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h | 4 +-
.../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 39 +++++++++++++------
drivers/gpu/drm/amd/amdkfd/kfd_process.c | 12 ++++--
3 files changed, 40 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
index 647220a8762d..30f145dc8724 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
@@ -265,8 +265,10 @@ int amdgpu_amdkfd_get_pcie_bandwidth_mbytes(struct amdgpu_device *adev, bool is_
(&((struct amdgpu_fpriv *) \
((struct drm_file *)(drm_priv))->driver_priv)->vm)

+int amdgpu_amdkfd_gpuvm_set_vm_pasid(struct amdgpu_device *adev,
+ struct file *filp, u32 pasid);
int amdgpu_amdkfd_gpuvm_acquire_process_vm(struct amdgpu_device *adev,
- struct file *filp, u32 pasid,
+ struct file *filp,
void **process_info,
struct dma_fence **ef);
void amdgpu_amdkfd_gpuvm_release_process_vm(struct amdgpu_device *adev,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 1f76e27f1a35..18a7cbbd00ff 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -1473,10 +1473,9 @@ static void amdgpu_amdkfd_gpuvm_unpin_bo(struct amdgpu_bo *bo)
amdgpu_bo_unreserve(bo);
}

-int amdgpu_amdkfd_gpuvm_acquire_process_vm(struct amdgpu_device *adev,
- struct file *filp, u32 pasid,
- void **process_info,
- struct dma_fence **ef)
+int amdgpu_amdkfd_gpuvm_set_vm_pasid(struct amdgpu_device *adev,
+ struct file *filp, u32 pasid)
+
{
struct amdgpu_fpriv *drv_priv;
struct amdgpu_vm *avm;
@@ -1487,10 +1486,6 @@ int amdgpu_amdkfd_gpuvm_acquire_process_vm(struct amdgpu_device *adev,
return ret;
avm = &drv_priv->vm;

- /* Already a compute VM? */
- if (avm->process_info)
- return -EINVAL;
-
/* Free the original amdgpu allocated pasid,
* will be replaced with kfd allocated pasid.
*/
@@ -1499,14 +1494,36 @@ int amdgpu_amdkfd_gpuvm_acquire_process_vm(struct amdgpu_device *adev,
amdgpu_vm_set_pasid(adev, avm, 0);
}

- /* Convert VM into a compute VM */
- ret = amdgpu_vm_make_compute(adev, avm);
+ ret = amdgpu_vm_set_pasid(adev, avm, pasid);
if (ret)
return ret;

- ret = amdgpu_vm_set_pasid(adev, avm, pasid);
+ return 0;
+}
+
+int amdgpu_amdkfd_gpuvm_acquire_process_vm(struct amdgpu_device *adev,
+ struct file *filp,
+ void **process_info,
+ struct dma_fence **ef)
+{
+ struct amdgpu_fpriv *drv_priv;
+ struct amdgpu_vm *avm;
+ int ret;
+
+ ret = amdgpu_file_to_fpriv(filp, &drv_priv);
if (ret)
return ret;
+ avm = &drv_priv->vm;
+
+ /* Already a compute VM? */
+ if (avm->process_info)
+ return -EINVAL;
+
+ /* Convert VM into a compute VM */
+ ret = amdgpu_vm_make_compute(adev, avm);
+ if (ret)
+ return ret;
+
/* Initialize KFD part of the VM and process info */
ret = init_kfd_vm(avm, process_info, ef);
if (ret)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index 9821fa9268d3..dd351105c1bc 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -1576,9 +1576,9 @@ int kfd_process_device_init_vm(struct kfd_process_device *pdd,
p = pdd->process;
dev = pdd->dev;

- ret = amdgpu_amdkfd_gpuvm_acquire_process_vm(
- dev->adev, drm_file, p->pasid,
- &p->kgd_process_info, &p->ef);
+ ret = amdgpu_amdkfd_gpuvm_acquire_process_vm(dev->adev, drm_file,
+ &p->kgd_process_info,
+ &p->ef);
if (ret) {
pr_err("Failed to create process VM object\n");
return ret;
@@ -1593,10 +1593,16 @@ int kfd_process_device_init_vm(struct kfd_process_device *pdd,
if (ret)
goto err_init_cwsr;

+ ret = amdgpu_amdkfd_gpuvm_set_vm_pasid(dev->adev, drm_file, p->pasid);
+ if (ret)
+ goto err_set_pasid;
+
pdd->drm_file = drm_file;

return 0;

+err_set_pasid:
+ kfd_process_device_destroy_cwsr_dgpu(pdd);
err_init_cwsr:
kfd_process_device_destroy_ib_mem(pdd);
err_reserve_ib_mem:
--
2.35.1

2023-01-04 12:45:30

by Kai Vehmanen

[permalink] [raw]
Subject: Re: [PATCH AUTOSEL 6.1 1/7] ASoC: SOF: Revert: "core: unregister clients and machine drivers in .shutdown"

Hi,

On Sat, 31 Dec 2022, Sasha Levin wrote:

> From: Kai Vehmanen <[email protected]>
>
> [ Upstream commit 44fda61d2bcfb74a942df93959e083a4e8eff75f ]
>
> The unregister machine drivers call is not safe to do when
> kexec is used. Kexec-lite gets blocked with following backtrace:

this should be picked together with commit 2aa2a5ead0e ("ASoC: SOF: Intel:
pci-tgl: unblock S5 entry if DMA stop has failed"), to not bring back old
bugs (system failures to enter S5 on shutdown). The revert patch
unfortunately fails to mention this dependency.

If I'm too late with my reply, I can send the second patch separately to
stable.

Br, Kai

2023-01-09 12:44:26

by Sasha Levin

[permalink] [raw]
Subject: Re: [PATCH AUTOSEL 6.1 1/7] ASoC: SOF: Revert: "core: unregister clients and machine drivers in .shutdown"

On Wed, Jan 04, 2023 at 02:34:55PM +0200, Kai Vehmanen wrote:
>Hi,
>
>On Sat, 31 Dec 2022, Sasha Levin wrote:
>
>> From: Kai Vehmanen <[email protected]>
>>
>> [ Upstream commit 44fda61d2bcfb74a942df93959e083a4e8eff75f ]
>>
>> The unregister machine drivers call is not safe to do when
>> kexec is used. Kexec-lite gets blocked with following backtrace:
>
>this should be picked together with commit 2aa2a5ead0e ("ASoC: SOF: Intel:
>pci-tgl: unblock S5 entry if DMA stop has failed"), to not bring back old
>bugs (system failures to enter S5 on shutdown). The revert patch
>unfortunately fails to mention this dependency.
>
>If I'm too late with my reply, I can send the second patch separately to
>stable.

I took 2aa2a5ead0e along with this patch, thanks!

--
Thanks,
Sasha