Currently, the nvdimm driver isn't RT compatible.
nd_region_acquire_lane() disables preemption with get_cpu(), which
causes "scheduling while atomic" spews on RT when using fio to test
pmem as a block device.

In this change, replace get_cpu()/put_cpu() with local_lock_cpu()/
local_unlock_cpu() and introduce the per-CPU variable "ndl_local_lock".
Since tasks can be preempted on RT, this lock avoids races on the same
lane between tasks running on the same CPU. When the number of CPUs is
greater than the number of lanes, a lane can be shared among CPUs, and
"ndl_lock->lock" protects the lane in that situation.

This patch is derived from Dan Williams and Pankaj Gupta's proposals at
https://www.mail-archive.com/[email protected]/msg13359.html
and https://www.spinics.net/lists/linux-rt-users/msg20280.html.
Many thanks to them.

Cc: Dan Williams <[email protected]>
Cc: Pankaj Gupta <[email protected]>
Cc: linux-rt-users <[email protected]>
Cc: linux-nvdimm <[email protected]>
Signed-off-by: Yongxin Liu <[email protected]>
---
drivers/nvdimm/region_devs.c | 40 +++++++++++++++++++---------------------
1 file changed, 19 insertions(+), 21 deletions(-)
diff --git a/drivers/nvdimm/region_devs.c b/drivers/nvdimm/region_devs.c
index fa37afcd43ff..6c5388cf2477 100644
--- a/drivers/nvdimm/region_devs.c
+++ b/drivers/nvdimm/region_devs.c
@@ -18,9 +18,13 @@
#include <linux/sort.h>
#include <linux/io.h>
#include <linux/nd.h>
+#include <linux/locallock.h>
#include "nd-core.h"
#include "nd.h"
+/* lock for tasks on the same CPU to sequence the access to the lane */
+static DEFINE_LOCAL_IRQ_LOCK(ndl_local_lock);
+
/*
* For readq() and writeq() on 32-bit builds, the hi-lo, lo-hi order is
* irrelevant.
@@ -935,18 +939,15 @@ int nd_blk_region_init(struct nd_region *nd_region)
unsigned int nd_region_acquire_lane(struct nd_region *nd_region)
{
unsigned int cpu, lane;
+ struct nd_percpu_lane *ndl_lock, *ndl_count;
- cpu = get_cpu();
- if (nd_region->num_lanes < nr_cpu_ids) {
- struct nd_percpu_lane *ndl_lock, *ndl_count;
+ cpu = local_lock_cpu(ndl_local_lock);
- lane = cpu % nd_region->num_lanes;
- ndl_count = per_cpu_ptr(nd_region->lane, cpu);
- ndl_lock = per_cpu_ptr(nd_region->lane, lane);
- if (ndl_count->count++ == 0)
- spin_lock(&ndl_lock->lock);
- } else
- lane = cpu;
+ lane = cpu % nd_region->num_lanes;
+ ndl_count = per_cpu_ptr(nd_region->lane, cpu);
+ ndl_lock = per_cpu_ptr(nd_region->lane, lane);
+ if (ndl_count->count++ == 0)
+ spin_lock(&ndl_lock->lock);
return lane;
}
@@ -954,17 +955,14 @@ EXPORT_SYMBOL(nd_region_acquire_lane);
void nd_region_release_lane(struct nd_region *nd_region, unsigned int lane)
{
- if (nd_region->num_lanes < nr_cpu_ids) {
- unsigned int cpu = get_cpu();
- struct nd_percpu_lane *ndl_lock, *ndl_count;
-
- ndl_count = per_cpu_ptr(nd_region->lane, cpu);
- ndl_lock = per_cpu_ptr(nd_region->lane, lane);
- if (--ndl_count->count == 0)
- spin_unlock(&ndl_lock->lock);
- put_cpu();
- }
- put_cpu();
+ struct nd_percpu_lane *ndl_lock, *ndl_count;
+ unsigned int cpu = smp_processor_id();
+
+ ndl_count = per_cpu_ptr(nd_region->lane, cpu);
+ ndl_lock = per_cpu_ptr(nd_region->lane, lane);
+ if (--ndl_count->count == 0)
+ spin_unlock(&ndl_lock->lock);
+ local_unlock_cpu(ndl_local_lock);
}
EXPORT_SYMBOL(nd_region_release_lane);
--
2.14.4
On Wed, Mar 6, 2019 at 2:05 AM Yongxin Liu <[email protected]> wrote:
Looks ok to me in concept.
Acked-by: Dan Williams <[email protected]>
On 2019-03-06 17:57:09 [+0800], Yongxin Liu wrote:
> In this change, we replace get_cpu/put_cpu with local_lock_cpu/
> local_unlock_cpu, and introduce per CPU variable "ndl_local_lock".
> Due to preemption on RT, this lock can avoid race condition for the
> same lane on the same CPU. When CPU number is greater than the lane
> number, lane can be shared among CPUs. "ndl_lock->lock" is used to
> protect the lane in this situation.
so what was the reason that get_cpu() can't be replaced with
raw_smp_processor_id()?
Sebastian
The lane is a critical resource which needs to be protected. One CPU can use only one
lane. If the number of CPUs is greater than the total number of lanes, a lane can be
shared among CPUs.

In a non-RT kernel, get_cpu() disables preemption by calling preempt_disable() first,
so only one thread on a given CPU can get the lane.

In an RT kernel, using only raw_smp_processor_id() doesn't protect the lane, so two
threads on the same CPU can get the same lane at the same time.

In this patch, the two-level lock avoids race conditions on the lane.
      CPU A                          CPU B  (B == A % num_lanes)

 task A1      task A2           task B1      task B2
    |            |                 |            |
    |____________|                 |____________|
          |                              |
   ndl_local_lock                 ndl_local_lock
          |                              |
          |______________________________|
                         |
                         |
                  ndl_lock->lock
                         |
                         |
                       lane
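
For illustration, here is a minimal, untested sketch of how a caller sees the
two-level scheme with the patch applied (do_io_on_lane() is a made-up placeholder,
not a real driver function):

static int example_io(struct nd_region *nd_region)
{
        unsigned int lane;
        int rc;

        /*
         * Level 1: local_lock_cpu() inside nd_region_acquire_lane()
         * serializes tasks running on the same CPU.
         * Level 2: the per-lane spinlock serializes CPUs that map to
         * the same lane (cpu % num_lanes).
         */
        lane = nd_region_acquire_lane(nd_region);
        rc = do_io_on_lane(nd_region, lane);    /* hypothetical I/O step */
        nd_region_release_lane(nd_region, lane);

        return rc;
}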
Thanks,
Yongxin
This patch looks good to me.
Acked-by: Pankaj Gupta <[email protected]>
On 2019-03-08 00:07:41 [+0000], Liu, Yongxin wrote:
> The lane is critical resource which needs to be protected. One CPU can use only one
> lane. If CPU number is greater than the number of total lane, the lane can be shared
> among CPUs.
>
> In non-RT kernel, get_cpu() disable preemption by calling preempt_disable() first.
> Only one thread on the same CPU can get the lane.
>
> In RT kernel, if we only use raw_smp_processor_id(), this doesn't protect the lane.
> Thus two threads on the same CPU can get the same lane at the same time.
>
> In this patch, two-level lock can avoid race condition for the lane.
but you still have the ndl_lock->lock which protects the resource. So in
the unlikely (but possible) event that you switch CPUs after obtaining
the CPU number, you block on the lock. No harm is done, right?
> Thanks,
> Yongxin
Sebastian
The resource "lane" can be acquired recursively, so "ndl_lock->lock" is a conditional lock.
ndl_count->count is per CPU.
ndl_lock->lock is per lane.
Here is an example:
Thread A on CPU 5 --> nd_region_acquire_lane --> lane# 5 --> get "ndl_lock->lock"
--> nd_region_acquire_lane --> lane# 5 --> bypass "ndl_lock->lock" due to "ndl_count->count++".
Thread B on CPU 5 --> nd_region_acquire_lane --> lane# 5 --> bypass "ndl_lock->lock" ("ndl_count->count"
was changed by Thread A)
If we use raw_smp_processor_id(), no matter which CPU the thread was migrated to,
if there is another thread running on the old CPU, there will be race condition
due to per CPU variable "ndl_count->count".
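
To make the conditional locking explicit, here is the count logic from the patch,
pulled out as a standalone sketch (the helper names are made up for illustration
and are not part of the driver):

static void lane_lock(struct nd_percpu_lane *ndl_count,
                      struct nd_percpu_lane *ndl_lock)
{
        /* only the outermost acquire on this CPU takes the per-lane lock */
        if (ndl_count->count++ == 0)
                spin_lock(&ndl_lock->lock);
}

static void lane_unlock(struct nd_percpu_lane *ndl_count,
                        struct nd_percpu_lane *ndl_lock)
{
        /* only the outermost release on this CPU drops it */
        if (--ndl_count->count == 0)
                spin_unlock(&ndl_lock->lock);
}

Without the per-CPU local lock, i.e. with only raw_smp_processor_id(), a second
task on the same CPU could run between the outermost count++ and the matching
count--, see a non-zero count, skip the spin_lock(), and touch the lane
concurrently.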
Thanks,
Yongxin
On 2019-03-11 00:44:58 [+0000], Liu, Yongxin wrote:
> > but you still have the ndl_lock->lock which protects the resource. So in
> > the unlikely (but possible event) that you switch CPUs after obtaining
> > the CPU number you block on the lock. No harm is done, right?
>
> The resource "lane" can be acquired recursively, so "ndl_lock->lock" is a conditional lock.
>
> ndl_count->count is per CPU.
> ndl_lock->lock is per lane.
>
> Here is an example:
> Thread A on CPU 5 --> nd_region_acquire_lane --> lane# 5 --> get "ndl_lock->lock"
> --> nd_region_acquire_lane --> lane# 5 --> bypass "ndl_lock->lock" due to "ndl_count->count++".
>
> Thread B on CPU 5 --> nd_region_acquire_lane --> lane# 5 --> bypass "ndl_lock->lock" ("ndl_count->count"
> was changed by Thread A)
>
> If we use raw_smp_processor_id(), no matter which CPU the thread was migrated to,
> if there is another thread running on the old CPU, there will be race condition
> due to per CPU variable "ndl_count->count".
so I've been looking at it again. The recursive locking could have been
solved better, like the local_lock() on -RT is doing it.
Given that you lock with preempt_disable() there should be no in-IRQ
usage.
But in the "nd_region->num_lanes >= nr_cpu_ids" case you don't take any
locks. That would be a problem with the raw_smp_processor_id() approach.
So what about the completely untested patch here:
diff --git a/drivers/nvdimm/nd.h b/drivers/nvdimm/nd.h
index 379bf4305e615..98c2e9df4b2e4 100644
--- a/drivers/nvdimm/nd.h
+++ b/drivers/nvdimm/nd.h
@@ -109,7 +109,8 @@ unsigned sizeof_namespace_label(struct nvdimm_drvdata *ndd);
res; res = next, next = next ? next->sibling : NULL)
struct nd_percpu_lane {
- int count;
+ struct task_struct *owner;
+ int nestcnt;
spinlock_t lock;
};
diff --git a/drivers/nvdimm/region_devs.c b/drivers/nvdimm/region_devs.c
index e2818f94f2928..8a62f9833513f 100644
--- a/drivers/nvdimm/region_devs.c
+++ b/drivers/nvdimm/region_devs.c
@@ -946,19 +946,17 @@ int nd_blk_region_init(struct nd_region *nd_region)
*/
unsigned int nd_region_acquire_lane(struct nd_region *nd_region)
{
+ struct nd_percpu_lane *ndl_lock;
unsigned int cpu, lane;
- cpu = get_cpu();
- if (nd_region->num_lanes < nr_cpu_ids) {
- struct nd_percpu_lane *ndl_lock, *ndl_count;
-
- lane = cpu % nd_region->num_lanes;
- ndl_count = per_cpu_ptr(nd_region->lane, cpu);
- ndl_lock = per_cpu_ptr(nd_region->lane, lane);
- if (ndl_count->count++ == 0)
- spin_lock(&ndl_lock->lock);
- } else
- lane = cpu;
+ cpu = raw_smp_processor_id();
+ lane = cpu % nd_region->num_lanes;
+ ndl_lock = per_cpu_ptr(nd_region->lane, lane);
+ if (ndl_lock->owner != current) {
+ spin_lock(&ndl_lock->lock);
+ ndl_lock->owner = current;
+ }
+ ndl_lock->nestcnt++;
return lane;
}
@@ -966,17 +964,16 @@ EXPORT_SYMBOL(nd_region_acquire_lane);
void nd_region_release_lane(struct nd_region *nd_region, unsigned int lane)
{
- if (nd_region->num_lanes < nr_cpu_ids) {
- unsigned int cpu = get_cpu();
- struct nd_percpu_lane *ndl_lock, *ndl_count;
+ struct nd_percpu_lane *ndl_lock;
- ndl_count = per_cpu_ptr(nd_region->lane, cpu);
- ndl_lock = per_cpu_ptr(nd_region->lane, lane);
- if (--ndl_count->count == 0)
- spin_unlock(&ndl_lock->lock);
- put_cpu();
- }
- put_cpu();
+ ndl_lock = per_cpu_ptr(nd_region->lane, lane);
+ WARN_ON(ndl_lock->nestcnt == 0);
+ WARN_ON(ndl_lock->owner != current);
+ if (--ndl_lock->nestcnt)
+ return;
+
+ ndl_lock->owner = NULL;
+ spin_unlock(&ndl_lock->lock);
}
EXPORT_SYMBOL(nd_region_release_lane);
@@ -1042,7 +1039,8 @@ static struct nd_region *nd_region_create(struct nvdimm_bus *nvdimm_bus,
ndl = per_cpu_ptr(nd_region->lane, i);
spin_lock_init(&ndl->lock);
- ndl->count = 0;
+ ndl->owner = NULL;
+ ndl->nestcnt = 0;
}
for (i = 0; i < ndr_desc->num_mappings; i++) {
> Thanks,
> Yongxin
Sebastian
Consider the recursive call to nd_region_acquire_lane() in the following situation.
Will there be a deadlock?

       Thread A                       Thread B
          |                              |
          |                              |
        CPU 1                          CPU 2
          |                              |
          |                              |
  get lock for Lane 1           get lock for Lane 2
          |                              |
          |                              |
   migrate to CPU 2              migrate to CPU 1
          |                              |
          |                              |
  wait lock for Lane 2          wait lock for Lane 1
          |                              |
          |______________________________|
                         |
                    dead lock ?
Thanks,
Yongxin
On 2019-03-18 01:41:10 [+0000], Liu, Yongxin wrote:
>
> Consider the recursive call to nd_region_acquire_lane() in the following situation.
> Will there be a dead lock?
>
>
>        Thread A                       Thread B
>           |                              |
>           |                              |
>         CPU 1                          CPU 2
>           |                              |
>           |                              |
>   get lock for Lane 1           get lock for Lane 2
>           |                              |
>           |                              |
>    migrate to CPU 2              migrate to CPU 1
>           |                              |
>           |                              |
>   wait lock for Lane 2          wait lock for Lane 1
>           |                              |
>           |______________________________|
>                          |
>                     dead lock ?
Bummer. That would deadlock indeed.
Is it easily possible to recognize the recursive case?
>
> Thanks,
> Yongxin
Sebastian
Not easily. I don't have a test case for the recursive call.
For now, just code analysis.

Yongxin
On 2019-03-18 11:48:28 [+0000], Liu, Yongxin wrote:
>
> >
> > Bummer. That would dead lock indeed.
> > Is it easily possible to recognize the recursive case?
>
> Not easily. I don't have test case for recursive call.
> For now, just code analysis.
So I've been playing with qemu's nvdimm device. I *think* the
recursive case is not possible here because qemu only supports pmem
while it would require the blk mode to trigger it. It is just a wild
guess…
On top of qemu's nvdimm device I can create a block device via
ndctl create-namespace namespace0.0 --mode=sector
and then I trigger the code path in question.
I would *really* prefer to understand the recursive case and avoid it.
That way the recursive case is explicitly known and uses another path.
The lock can then always be acquired, which gives you lockdep coverage
all the time (which is now missing unless you have more LANEs than CPUs).
The local_lock thingy is completely unneeded: a simple get_cpu_light()
would do the job.
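
For context, a rough sketch of what those helpers look like in the -RT patch set
(paraphrased, not a verbatim copy of the RT tree):

/* On -RT, roughly: */
#define get_cpu_light()         ({ migrate_disable(); smp_processor_id(); })
#define put_cpu_light()         migrate_enable()

So the task stays pinned to its current CPU but remains preemptible, which is why
a (sleeping, on RT) spin_lock() taken afterwards is fine.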
> Yongxin
Sebastian