2019-10-30 15:25:59

by Daniel Wagner

[permalink] [raw]
Subject: [RFC] nvmet: Always remove processed AER elements from list

All async events are enqueued via nvmet_add_async_event() which
updates the ctrl->async_event_cmds[] array and additionally an struct
nvmet_async_event is added to the ctrl->async_events list.

Under normal operations the nvmet_async_event_work() updates again the
ctrl->async_event_cmds and removes the corresponding struct
nvmet_async_event from the list again. Though nvmet_sq_destroy() could
be called which calles nvmet_async_events_free() which only updates
the ctrl->async_event_cmds[] array.

Add a new function nvmet_async_events_process() which processes the
async events and updates both array and the list. With this we avoid
having two places where the array and list are modified.

Cc: Christoph Hellwig <[email protected]>
Cc: Sagi Grimberg <[email protected]>
Cc: Johannes Thumshirn <[email protected]>
Signed-off-by: Daniel Wagner <[email protected]>
---
Hi,

I've got following oops:

PID: 79413 TASK: ffff92f03a814ec0 CPU: 19 COMMAND: "kworker/19:2"
#0 [ffffa5308b8c3c58] machine_kexec at ffffffff8e05dd02
#1 [ffffa5308b8c3ca8] __crash_kexec at ffffffff8e12102a
#2 [ffffa5308b8c3d68] crash_kexec at ffffffff8e122019
#3 [ffffa5308b8c3d80] oops_end at ffffffff8e02e091
#4 [ffffa5308b8c3da0] general_protection at ffffffff8e8015c5
[exception RIP: nvmet_async_event_work+94]
RIP: ffffffffc0d9a80e RSP: ffffa5308b8c3e58 RFLAGS: 00010202
RAX: dead000000000100 RBX: ffff92dcbc7464b0 RCX: 0000000000000002
RDX: 0000000000040002 RSI: 38ffff92dc9814cf RDI: ffff92f217722f20
RBP: ffff92dcbc746418 R8: 0000000000000000 R9: 0000000000000000
R10: 000000000000035b R11: ffff92efb8dd2091 R12: ffff92dcbc7464a0
R13: ffff92dbe03a5f29 R14: 0000000000000000 R15: 0ffff92f92f26864
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#5 [ffffa5308b8c3e78] process_one_work at ffffffff8e0a3b0c
#6 [ffffa5308b8c3eb8] worker_thread at ffffffff8e0a41e7
#7 [ffffa5308b8c3f10] kthread at ffffffff8e0a93af
#8 [ffffa5308b8c3f50] ret_from_fork at ffffffff8e800235

this maps to nvmet_async_event_results. So it looks like this function
access a stale aen pointer:

static u32 nvmet_async_event_result(struct nvmet_async_event *aen)
{
return aen->event_type | (aen->event_info << 8) | (aen->log_page << 16);
}

The test which is executed removes during operation one of the nodes.

The only odd thing I could figure out so far is that the list is not
updated in sync with the array in all cases. I suppose this could be
the source of the crash. Not verified yet.

Johannes though it makes sense to post this as question.

Thanks,
Daniel

drivers/nvme/target/core.c | 39 +++++++++++++++++----------------------
1 file changed, 17 insertions(+), 22 deletions(-)

diff --git a/drivers/nvme/target/core.c b/drivers/nvme/target/core.c
index 3a67e244e568..9d9efd585847 100644
--- a/drivers/nvme/target/core.c
+++ b/drivers/nvme/target/core.c
@@ -129,27 +129,8 @@ static u32 nvmet_async_event_result(struct nvmet_async_event *aen)
return aen->event_type | (aen->event_info << 8) | (aen->log_page << 16);
}

-static void nvmet_async_events_free(struct nvmet_ctrl *ctrl)
+static void nvmet_async_events_process(struct nvmet_ctrl *ctrl, u16 status)
{
- struct nvmet_req *req;
-
- while (1) {
- mutex_lock(&ctrl->lock);
- if (!ctrl->nr_async_event_cmds) {
- mutex_unlock(&ctrl->lock);
- return;
- }
-
- req = ctrl->async_event_cmds[--ctrl->nr_async_event_cmds];
- mutex_unlock(&ctrl->lock);
- nvmet_req_complete(req, NVME_SC_INTERNAL | NVME_SC_DNR);
- }
-}
-
-static void nvmet_async_event_work(struct work_struct *work)
-{
- struct nvmet_ctrl *ctrl =
- container_of(work, struct nvmet_ctrl, async_event_work);
struct nvmet_async_event *aen;
struct nvmet_req *req;

@@ -163,16 +144,30 @@ static void nvmet_async_event_work(struct work_struct *work)
}

req = ctrl->async_event_cmds[--ctrl->nr_async_event_cmds];
- nvmet_set_result(req, nvmet_async_event_result(aen));
+ if (status == 0)
+ nvmet_set_result(req, nvmet_async_event_result(aen));

list_del(&aen->entry);
kfree(aen);

mutex_unlock(&ctrl->lock);
- nvmet_req_complete(req, 0);
+ nvmet_req_complete(req, status);
}
}

+static void nvmet_async_events_free(struct nvmet_ctrl *ctrl)
+{
+ nvmet_async_events_process(ctrl, NVME_SC_INTERNAL | NVME_SC_DNR);
+}
+
+static void nvmet_async_event_work(struct work_struct *work)
+{
+ struct nvmet_ctrl *ctrl =
+ container_of(work, struct nvmet_ctrl, async_event_work);
+
+ nvmet_async_events_process(ctrl, 0);
+}
+
void nvmet_add_async_event(struct nvmet_ctrl *ctrl, u8 event_type,
u8 event_info, u8 log_page)
{
--
2.16.4


2019-10-30 20:03:56

by Chaitanya Kulkarni

[permalink] [raw]
Subject: Re: [RFC] nvmet: Always remove processed AER elements from list

On 10/30/2019 08:24 AM, Daniel Wagner wrote:
> Hi,
>
> I've got following oops:
>
> PID: 79413 TASK: ffff92f03a814ec0 CPU: 19 COMMAND: "kworker/19:2"
> #0 [ffffa5308b8c3c58] machine_kexec at ffffffff8e05dd02
> #1 [ffffa5308b8c3ca8] __crash_kexec at ffffffff8e12102a
> #2 [ffffa5308b8c3d68] crash_kexec at ffffffff8e122019
> #3 [ffffa5308b8c3d80] oops_end at ffffffff8e02e091
> #4 [ffffa5308b8c3da0] general_protection at ffffffff8e8015c5
> [exception RIP: nvmet_async_event_work+94]
> RIP: ffffffffc0d9a80e RSP: ffffa5308b8c3e58 RFLAGS: 00010202
> RAX: dead000000000100 RBX: ffff92dcbc7464b0 RCX: 0000000000000002
> RDX: 0000000000040002 RSI: 38ffff92dc9814cf RDI: ffff92f217722f20
> RBP: ffff92dcbc746418 R8: 0000000000000000 R9: 0000000000000000
> R10: 000000000000035b R11: ffff92efb8dd2091 R12: ffff92dcbc7464a0
> R13: ffff92dbe03a5f29 R14: 0000000000000000 R15: 0ffff92f92f26864
> ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
> #5 [ffffa5308b8c3e78] process_one_work at ffffffff8e0a3b0c
> #6 [ffffa5308b8c3eb8] worker_thread at ffffffff8e0a41e7
> #7 [ffffa5308b8c3f10] kthread at ffffffff8e0a93af
> #8 [ffffa5308b8c3f50] ret_from_fork at ffffffff8e800235
>
> this maps to nvmet_async_event_results. So it looks like this function
> access a stale aen pointer:
>
> static u32 nvmet_async_event_result(struct nvmet_async_event *aen)
> {
> return aen->event_type | (aen->event_info << 8) | (aen->log_page << 16);
> }
Can you please explain the test setup ? Is that coming from the tests
present in the blktests ? if so you can please provide test number ?

2019-10-31 07:08:55

by Johannes Thumshirn

[permalink] [raw]
Subject: Re: [RFC] nvmet: Always remove processed AER elements from list

On 2019-10-30 20:58, Chaitanya Kulkarni wrote:
> On 10/30/2019 08:24 AM, Daniel Wagner wrote:
>> Hi,
>>
>> I've got following oops:
>>
>> PID: 79413 TASK: ffff92f03a814ec0 CPU: 19 COMMAND: "kworker/19:2"
>> #0 [ffffa5308b8c3c58] machine_kexec at ffffffff8e05dd02
>> #1 [ffffa5308b8c3ca8] __crash_kexec at ffffffff8e12102a
>> #2 [ffffa5308b8c3d68] crash_kexec at ffffffff8e122019
>> #3 [ffffa5308b8c3d80] oops_end at ffffffff8e02e091
>> #4 [ffffa5308b8c3da0] general_protection at ffffffff8e8015c5
>> [exception RIP: nvmet_async_event_work+94]
>> RIP: ffffffffc0d9a80e RSP: ffffa5308b8c3e58 RFLAGS: 00010202
>> RAX: dead000000000100 RBX: ffff92dcbc7464b0 RCX:
>> 0000000000000002
>> RDX: 0000000000040002 RSI: 38ffff92dc9814cf RDI:
>> ffff92f217722f20
>> RBP: ffff92dcbc746418 R8: 0000000000000000 R9:
>> 0000000000000000
>> R10: 000000000000035b R11: ffff92efb8dd2091 R12:
>> ffff92dcbc7464a0
>> R13: ffff92dbe03a5f29 R14: 0000000000000000 R15:
>> 0ffff92f92f26864
>> ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
>> #5 [ffffa5308b8c3e78] process_one_work at ffffffff8e0a3b0c
>> #6 [ffffa5308b8c3eb8] worker_thread at ffffffff8e0a41e7
>> #7 [ffffa5308b8c3f10] kthread at ffffffff8e0a93af
>> #8 [ffffa5308b8c3f50] ret_from_fork at ffffffff8e800235
>>
>> this maps to nvmet_async_event_results. So it looks like this function
>> access a stale aen pointer:
>>
>> static u32 nvmet_async_event_result(struct nvmet_async_event *aen)
>> {
>> return aen->event_type | (aen->event_info << 8) |
>> (aen->log_page << 16);
>> }
> Can you please explain the test setup ? Is that coming from the tests
> present in the blktests ? if so you can please provide test number ?

No unfortunately this is coming from a customer bug report. We _think_
we're having a race between AEN processing and nvmet_sq_destroy(), but
we're not 100% sure. Hence this RFC.


2019-10-31 14:53:41

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [RFC] nvmet: Always remove processed AER elements from list

On Wed, Oct 30, 2019 at 04:24:18PM +0100, Daniel Wagner wrote:
> All async events are enqueued via nvmet_add_async_event() which
> updates the ctrl->async_event_cmds[] array and additionally an struct
> nvmet_async_event is added to the ctrl->async_events list.
>
> Under normal operations the nvmet_async_event_work() updates again the
> ctrl->async_event_cmds and removes the corresponding struct
> nvmet_async_event from the list again. Though nvmet_sq_destroy() could
> be called which calles nvmet_async_events_free() which only updates
> the ctrl->async_event_cmds[] array.
>
> Add a new function nvmet_async_events_process() which processes the
> async events and updates both array and the list. With this we avoid
> having two places where the array and list are modified.

I don't think this patch is correct. We can have AEN commands pending
that aren't used - that is the host sent the command, but the target
did not have even event yet. That means the command sits in
async_event_cmds, but there is no entry in >async_events for it yet.

That being said I think what we want is to do the loop in your new
nvmet_async_events_free first, but after that still call the loop in
the existing nvmet_async_events_free after that. That ensures we first
flush out everything in ->async_events, and then also return any
potential remaining entry.

2019-11-03 18:59:15

by Chaitanya Kulkarni

[permalink] [raw]
Subject: Re: [RFC] nvmet: Always remove processed AER elements from list

On 10/31/2019 07:51 AM, Christoph Hellwig wrote:
> On Wed, Oct 30, 2019 at 04:24:18PM +0100, Daniel Wagner wrote:
>> All async events are enqueued via nvmet_add_async_event() which
>> updates the ctrl->async_event_cmds[] array and additionally an struct
>> nvmet_async_event is added to the ctrl->async_events list.
>>
>> Under normal operations the nvmet_async_event_work() updates again the
>> ctrl->async_event_cmds and removes the corresponding struct
>> nvmet_async_event from the list again. Though nvmet_sq_destroy() could
>> be called which calles nvmet_async_events_free() which only updates
>> the ctrl->async_event_cmds[] array.
>>
>> Add a new function nvmet_async_events_process() which processes the
>> async events and updates both array and the list. With this we avoid
>> having two places where the array and list are modified.
>
> I don't think this patch is correct. We can have AEN commands pending
> that aren't used - that is the host sent the command, but the target
> did not have even event yet. That means the command sits in
> async_event_cmds, but there is no entry in >async_events for it yet.
>
> That being said I think what we want is to do the loop in your new
> nvmet_async_events_free first, but after that still call the loop in
> the existing nvmet_async_events_free after that. That ensures we first
> flush out everything in ->async_events, and then also return any
> potential remaining entry.
>
Something like following on the top of this patch ?
(compile tested only).

diff --git a/drivers/nvme/target/core.c b/drivers/nvme/target/core.c
index b1b9dc58c3b4..36a859082846 100644
--- a/drivers/nvme/target/core.c
+++ b/drivers/nvme/target/core.c
@@ -153,6 +153,18 @@ static void nvmet_async_events_process(struct
nvmet_ctrl *ctrl, u16 status)
mutex_unlock(&ctrl->lock);
nvmet_req_complete(req, status);
}
+
+ while (1) {
+ mutex_lock(&ctrl->lock);
+ if (!ctrl->nr_async_event_cmds) {
+ mutex_unlock(&ctrl->lock);
+ return;
+ }
+
+ req = ctrl->async_event_cmds[--ctrl->nr_async_event_cmds];
+ mutex_unlock(&ctrl->lock);
+ nvmet_req_complete(req, NVME_SC_INTERNAL | NVME_SC_DNR);
+ }
}

static void nvmet_async_events_free(struct nvmet_ctrl *ctrl)

2019-11-03 19:50:44

by Chaitanya Kulkarni

[permalink] [raw]
Subject: Re: [RFC] nvmet: Always remove processed AER elements from list

It will not work as it will consume outstanding command
posted by host for which aen is not generated yet, and
when aen is generated it will not have command in the
async_event_cmds[].

On 11/03/2019 10:55 AM, Chaitanya Kulkarni wrote:
> Something like following on the top of this patch ?
> (compile tested only).
>
> diff --git a/drivers/nvme/target/core.c b/drivers/nvme/target/core.c
> index b1b9dc58c3b4..36a859082846 100644
> --- a/drivers/nvme/target/core.c
> +++ b/drivers/nvme/target/core.c
> @@ -153,6 +153,18 @@ static void nvmet_async_events_process(struct
> nvmet_ctrl *ctrl, u16 status)
> mutex_unlock(&ctrl->lock);
> nvmet_req_complete(req, status);
> }
> +
> + while (1) {
> + mutex_lock(&ctrl->lock);
> + if (!ctrl->nr_async_event_cmds) {
> + mutex_unlock(&ctrl->lock);
> + return;
> + }
> +
> + req = ctrl->async_event_cmds[--ctrl->nr_async_event_cmds];
> + mutex_unlock(&ctrl->lock);
> + nvmet_req_complete(req, NVME_SC_INTERNAL | NVME_SC_DNR);
> + }
> }
>
> static void nvmet_async_events_free(struct nvmet_ctrl *ctrl)
>

2019-11-03 20:20:30

by Chaitanya Kulkarni

[permalink] [raw]
Subject: Re: [RFC] nvmet: Always remove processed AER elements from list

Sent out a patch with fix, please have a look.

On 11/03/2019 11:48 AM, Chaitanya Kulkarni wrote:
> It will not work as it will consume outstanding command
> posted by host for which aen is not generated yet, and
> when aen is generated it will not have command in the
> async_event_cmds[].