When running blktest's nvme/005 with a lockdep enabled kernel the test
case fails due to the following lockdep splat in dmesg:
[ 18.206166] =============================
[ 18.207286] WARNING: suspicious RCU usage
[ 18.208417] 4.17.0-rc5 #881 Not tainted
[ 18.209487] -----------------------------
[ 18.210612] drivers/nvme/host/nvme.h:457 suspicious rcu_dereference_check() usage!
[ 18.213486]
[ 18.213486] other info that might help us debug this:
[ 18.213486]
[ 18.214745]
[ 18.214745] rcu_scheduler_active = 2, debug_locks = 1
[ 18.215798] 3 locks held by kworker/u32:5/1102:
[ 18.216535] #0: (ptrval) ((wq_completion)"nvme-wq"){+.+.}, at: process_one_work+0x152/0x5c0
[ 18.217983] #1: (ptrval) ((work_completion)(&ctrl->scan_work)){+.+.}, at: process_one_work+0x152/0x5c0
[ 18.219584] #2: (ptrval) (&subsys->lock#2){+.+.}, at: nvme_ns_remove+0x43/0x1c0 [nvme_core]
[ 18.221037]
[ 18.221037] stack backtrace:
[ 18.221721] CPU: 12 PID: 1102 Comm: kworker/u32:5 Not tainted 4.17.0-rc5 #881
[ 18.222830] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014
[ 18.224451] Workqueue: nvme-wq nvme_scan_work [nvme_core]
[ 18.225308] Call Trace:
[ 18.225704] dump_stack+0x78/0xb3
[ 18.226224] nvme_ns_remove+0x1a3/0x1c0 [nvme_core]
[ 18.226975] nvme_validate_ns+0x87/0x850 [nvme_core]
[ 18.227749] ? blk_queue_exit+0x69/0x110
[ 18.228358] ? blk_queue_exit+0x81/0x110
[ 18.228960] ? direct_make_request+0x1a0/0x1a0
[ 18.229649] nvme_scan_work+0x212/0x2d0 [nvme_core]
[ 18.230411] process_one_work+0x1d8/0x5c0
[ 18.231037] ? process_one_work+0x152/0x5c0
[ 18.231705] worker_thread+0x45/0x3e0
[ 18.232282] kthread+0x101/0x140
[ 18.232788] ? process_one_work+0x5c0/0x5c0
The only caller of nvme_mpath_clear_current_path() is nvme_ns_remove()
which holds the subsys lock so it's likely a false positive, but using
rcu_dereference_protected() tells lockdep we're holding the lock.
Fixes: 32acab3181c7 ("nvme: implement multipath access to nvme subsystems")
Signed-off-by: Johannes Thumshirn <[email protected]>
---
drivers/nvme/host/nvme.h | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 17d2f7cf3fed..ca034434ebb9 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -22,6 +22,7 @@
#include <linux/lightnvm.h>
#include <linux/sed-opal.h>
#include <linux/fault-inject.h>
+#include <linux/rcupdate.h>
extern unsigned int nvme_io_timeout;
#define NVME_IO_TIMEOUT (nvme_io_timeout * HZ)
@@ -454,7 +455,9 @@ static inline void nvme_mpath_clear_current_path(struct nvme_ns *ns)
{
struct nvme_ns_head *head = ns->head;
- if (head && ns == srcu_dereference(head->current_path, &head->srcu))
+ if (head &&
+ ns == rcu_dereference_protected(head->current_path,
+ lockdep_is_held(&ns->ctrl->subsys->lock)))
rcu_assign_pointer(head->current_path, NULL);
}
struct nvme_ns *nvme_find_path(struct nvme_ns_head *head);
--
2.16.3
> extern unsigned int nvme_io_timeout;
> #define NVME_IO_TIMEOUT (nvme_io_timeout * HZ)
> @@ -454,7 +455,9 @@ static inline void nvme_mpath_clear_current_path(struct nvme_ns *ns)
> {
> struct nvme_ns_head *head = ns->head;
>
> - if (head && ns == srcu_dereference(head->current_path, &head->srcu))
> + if (head &&
> + ns == rcu_dereference_protected(head->current_path,
> + lockdep_is_held(&ns->ctrl->subsys->lock)))
> rcu_assign_pointer(head->current_path, NULL);
> }
> struct nvme_ns *nvme_find_path(struct nvme_ns_head *head);
We don't really dereference it at all in fact, but just check the
pointers for equality. I wonder if there is a better way to do this,
as my ANA patches add a caller without the lock (and withou SRU
protection either now that I think of it) - for a pure pointer compare
we really should not need any sort of protection.
On Mon, May 14, 2018 at 05:42:30AM -0700, Christoph Hellwig wrote:
> > extern unsigned int nvme_io_timeout;
> > #define NVME_IO_TIMEOUT (nvme_io_timeout * HZ)
> > @@ -454,7 +455,9 @@ static inline void nvme_mpath_clear_current_path(struct nvme_ns *ns)
> > {
> > struct nvme_ns_head *head = ns->head;
> >
> > - if (head && ns == srcu_dereference(head->current_path, &head->srcu))
> > + if (head &&
> > + ns == rcu_dereference_protected(head->current_path,
> > + lockdep_is_held(&ns->ctrl->subsys->lock)))
> > rcu_assign_pointer(head->current_path, NULL);
> > }
> > struct nvme_ns *nvme_find_path(struct nvme_ns_head *head);
>
> We don't really dereference it at all in fact, but just check the
> pointers for equality. I wonder if there is a better way to do this,
> as my ANA patches add a caller without the lock (and withou SRU
> protection either now that I think of it) - for a pure pointer compare
> we really should not need any sort of protection.
Uff maybe, but are you sure a comparison of two pointer is always
atomic (on all architectures)?
Paul, can you shed some light on us mere mortal, whether the above
rcu_dereference_protected() is needed or if a simple ns ==
head->current_path is sufficient.
Thanks,
Johannes
--
Johannes Thumshirn Storage
[email protected] +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 N?rnberg
GF: Felix Imend?rffer, Jane Smithard, Graham Norton
HRB 21284 (AG N?rnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
On Mon, May 14, 2018 at 05:42:30AM -0700, Christoph Hellwig wrote:
> > extern unsigned int nvme_io_timeout;
> > #define NVME_IO_TIMEOUT (nvme_io_timeout * HZ)
> > @@ -454,7 +455,9 @@ static inline void nvme_mpath_clear_current_path(struct nvme_ns *ns)
> > {
> > struct nvme_ns_head *head = ns->head;
> >
> > - if (head && ns == srcu_dereference(head->current_path, &head->srcu))
> > + if (head &&
> > + ns == rcu_dereference_protected(head->current_path,
> > + lockdep_is_held(&ns->ctrl->subsys->lock)))
> > rcu_assign_pointer(head->current_path, NULL);
> > }
> > struct nvme_ns *nvme_find_path(struct nvme_ns_head *head);
>
> We don't really dereference it at all in fact, but just check the
> pointers for equality. I wonder if there is a better way to do this,
> as my ANA patches add a caller without the lock (and withou SRU
> protection either now that I think of it) - for a pure pointer compare
> we really should not need any sort of protection.
If you are just looking at the value of an RCU-protected pointer, then
using rcu_access_pointer() will cause RCU to just read out the value
and otherwise keeps its mouth shut.
If you use rcu_access_pointer() and later dereference the value without
protection, you will of course get what you deserve, good and hard. ;-)
Thanx, Paul
On Mon, May 14, 2018 at 06:31:05AM -0700, Paul E. McKenney wrote:
> If you are just looking at the value of an RCU-protected pointer, then
> using rcu_access_pointer() will cause RCU to just read out the value
> and otherwise keeps its mouth shut.
>
> If you use rcu_access_pointer() and later dereference the value without
> protection, you will of course get what you deserve, good and hard. ;-)
Thanks Paul.
Christoph, I'll be sending the v2 probably tomorrow as I have more
lockdep fixes for nvme in the pipe and I'll send them out as a
complete series.
--
Johannes Thumshirn Storage
[email protected] +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 N?rnberg
GF: Felix Imend?rffer, Jane Smithard, Graham Norton
HRB 21284 (AG N?rnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
On Mon, May 14, 2018 at 06:31:05AM -0700, Paul E. McKenney wrote:
> > > + if (head &&
> > > + ns == rcu_dereference_protected(head->current_path,
> > > + lockdep_is_held(&ns->ctrl->subsys->lock)))
> > > rcu_assign_pointer(head->current_path, NULL);
> > > }
> > > struct nvme_ns *nvme_find_path(struct nvme_ns_head *head);
> >
> > We don't really dereference it at all in fact, but just check the
> > pointers for equality. I wonder if there is a better way to do this,
> > as my ANA patches add a caller without the lock (and withou SRU
> > protection either now that I think of it) - for a pure pointer compare
> > we really should not need any sort of protection.
>
> If you are just looking at the value of an RCU-protected pointer, then
> using rcu_access_pointer() will cause RCU to just read out the value
> and otherwise keeps its mouth shut.
That is exactly the function I was looking for. And given that srcu
and rcu use the same annotations I should have through of being able
to use it of course. As you see above we only use the return value
to do a comparison, so we should be perfectly fine.
Johannes, can you respin the patch to use rcu_access_pointer?
On Mon, May 14, 2018 at 02:57:25PM +0200, Johannes Thumshirn wrote:
> On Mon, May 14, 2018 at 05:42:30AM -0700, Christoph Hellwig wrote:
> > > extern unsigned int nvme_io_timeout;
> > > #define NVME_IO_TIMEOUT (nvme_io_timeout * HZ)
> > > @@ -454,7 +455,9 @@ static inline void nvme_mpath_clear_current_path(struct nvme_ns *ns)
> > > {
> > > struct nvme_ns_head *head = ns->head;
> > >
> > > - if (head && ns == srcu_dereference(head->current_path, &head->srcu))
> > > + if (head &&
> > > + ns == rcu_dereference_protected(head->current_path,
> > > + lockdep_is_held(&ns->ctrl->subsys->lock)))
> > > rcu_assign_pointer(head->current_path, NULL);
> > > }
> > > struct nvme_ns *nvme_find_path(struct nvme_ns_head *head);
> >
> > We don't really dereference it at all in fact, but just check the
> > pointers for equality. I wonder if there is a better way to do this,
> > as my ANA patches add a caller without the lock (and withou SRU
> > protection either now that I think of it) - for a pure pointer compare
> > we really should not need any sort of protection.
>
> Uff maybe, but are you sure a comparison of two pointer is always
> atomic (on all architectures)?
>
> Paul, can you shed some light on us mere mortal, whether the above
> rcu_dereference_protected() is needed or if a simple ns ==
> head->current_path is sufficient.
One approach is the following:
static inline void nvme_mpath_clear_current_path(struct nvme_ns *ns)
{
struct nvme_ns_head *head = ns->head;
if (head && ns == rcu_access_pointer(head->current_path))
rcu_assign_pointer(head->current_path, NULL);
}
Without the rcu_access_pointer(), sparse (and thus the 0-day test robot)
will complain that you are accessing an RCU-protected pointer without
using RCU. However, rcu_access_pointer() won't ever give any lockdep
splats about there being no RCU read-side critical section.
You might still want rcu_dereference_protected() because it will yell
at you if the lock is not held. Yes, the comparison will still be valid
without the lock (at least at the exact moment when the load occurred),
but the rcu_assign_pointer() might be a bit problematic if that lock is
not held, right?
But it is your guys' code, so I must defer to you for the intent.
Thanx, Paul
On Mon, May 14, 2018 at 06:38:49AM -0700, Paul E. McKenney wrote:
> On Mon, May 14, 2018 at 02:57:25PM +0200, Johannes Thumshirn wrote:
> > On Mon, May 14, 2018 at 05:42:30AM -0700, Christoph Hellwig wrote:
> > > > extern unsigned int nvme_io_timeout;
> > > > #define NVME_IO_TIMEOUT (nvme_io_timeout * HZ)
> > > > @@ -454,7 +455,9 @@ static inline void nvme_mpath_clear_current_path(struct nvme_ns *ns)
> > > > {
> > > > struct nvme_ns_head *head = ns->head;
> > > >
> > > > - if (head && ns == srcu_dereference(head->current_path, &head->srcu))
> > > > + if (head &&
> > > > + ns == rcu_dereference_protected(head->current_path,
> > > > + lockdep_is_held(&ns->ctrl->subsys->lock)))
> > > > rcu_assign_pointer(head->current_path, NULL);
> > > > }
> > > > struct nvme_ns *nvme_find_path(struct nvme_ns_head *head);
> > >
> > > We don't really dereference it at all in fact, but just check the
> > > pointers for equality. I wonder if there is a better way to do this,
> > > as my ANA patches add a caller without the lock (and withou SRU
> > > protection either now that I think of it) - for a pure pointer compare
> > > we really should not need any sort of protection.
> >
> > Uff maybe, but are you sure a comparison of two pointer is always
> > atomic (on all architectures)?
> >
> > Paul, can you shed some light on us mere mortal, whether the above
> > rcu_dereference_protected() is needed or if a simple ns ==
> > head->current_path is sufficient.
>
> One approach is the following:
>
> static inline void nvme_mpath_clear_current_path(struct nvme_ns *ns)
> {
> struct nvme_ns_head *head = ns->head;
>
> if (head && ns == rcu_access_pointer(head->current_path))
> rcu_assign_pointer(head->current_path, NULL);
> }
Yes that's what I have now as well, and it tests fine.
>
> Without the rcu_access_pointer(), sparse (and thus the 0-day test robot)
> will complain that you are accessing an RCU-protected pointer without
> using RCU. However, rcu_access_pointer() won't ever give any lockdep
> splats about there being no RCU read-side critical section.
>
> You might still want rcu_dereference_protected() because it will yell
> at you if the lock is not held. Yes, the comparison will still be valid
> without the lock (at least at the exact moment when the load occurred),
> but the rcu_assign_pointer() might be a bit problematic if that lock is
> not held, right?
>
> But it is your guys' code, so I must defer to you for the intent.
>
> Thanx, Paul
>
>
> _______________________________________________
> Linux-nvme mailing list
> [email protected]
> http://lists.infradead.org/mailman/listinfo/linux-nvme
--
Johannes Thumshirn Storage
[email protected] +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 N?rnberg
GF: Felix Imend?rffer, Jane Smithard, Graham Norton
HRB 21284 (AG N?rnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
On Mon, May 14, 2018 at 03:56:22PM +0200, Johannes Thumshirn wrote:
> On Mon, May 14, 2018 at 06:38:49AM -0700, Paul E. McKenney wrote:
> > On Mon, May 14, 2018 at 02:57:25PM +0200, Johannes Thumshirn wrote:
> > > On Mon, May 14, 2018 at 05:42:30AM -0700, Christoph Hellwig wrote:
> > > > > extern unsigned int nvme_io_timeout;
> > > > > #define NVME_IO_TIMEOUT (nvme_io_timeout * HZ)
> > > > > @@ -454,7 +455,9 @@ static inline void nvme_mpath_clear_current_path(struct nvme_ns *ns)
> > > > > {
> > > > > struct nvme_ns_head *head = ns->head;
> > > > >
> > > > > - if (head && ns == srcu_dereference(head->current_path, &head->srcu))
> > > > > + if (head &&
> > > > > + ns == rcu_dereference_protected(head->current_path,
> > > > > + lockdep_is_held(&ns->ctrl->subsys->lock)))
> > > > > rcu_assign_pointer(head->current_path, NULL);
> > > > > }
> > > > > struct nvme_ns *nvme_find_path(struct nvme_ns_head *head);
> > > >
> > > > We don't really dereference it at all in fact, but just check the
> > > > pointers for equality. I wonder if there is a better way to do this,
> > > > as my ANA patches add a caller without the lock (and withou SRU
> > > > protection either now that I think of it) - for a pure pointer compare
> > > > we really should not need any sort of protection.
> > >
> > > Uff maybe, but are you sure a comparison of two pointer is always
> > > atomic (on all architectures)?
> > >
> > > Paul, can you shed some light on us mere mortal, whether the above
> > > rcu_dereference_protected() is needed or if a simple ns ==
> > > head->current_path is sufficient.
> >
> > One approach is the following:
> >
> > static inline void nvme_mpath_clear_current_path(struct nvme_ns *ns)
> > {
> > struct nvme_ns_head *head = ns->head;
> >
> > if (head && ns == rcu_access_pointer(head->current_path))
> > rcu_assign_pointer(head->current_path, NULL);
> > }
>
> Yes that's what I have now as well, and it tests fine.
Very good! If it turns out to be useful, you can of course directly
use lockdep_assert_held() to verify that the lock is held.
Thanx, Paul
> > Without the rcu_access_pointer(), sparse (and thus the 0-day test robot)
> > will complain that you are accessing an RCU-protected pointer without
> > using RCU. However, rcu_access_pointer() won't ever give any lockdep
> > splats about there being no RCU read-side critical section.
> >
> > You might still want rcu_dereference_protected() because it will yell
> > at you if the lock is not held. Yes, the comparison will still be valid
> > without the lock (at least at the exact moment when the load occurred),
> > but the rcu_assign_pointer() might be a bit problematic if that lock is
> > not held, right?
> >
> > But it is your guys' code, so I must defer to you for the intent.
> >
> > Thanx, Paul
> >
> >
> > _______________________________________________
> > Linux-nvme mailing list
> > [email protected]
> > http://lists.infradead.org/mailman/listinfo/linux-nvme
>
> --
> Johannes Thumshirn Storage
> [email protected] +49 911 74053 689
> SUSE LINUX GmbH, Maxfeldstr. 5, 90409 N?rnberg
> GF: Felix Imend?rffer, Jane Smithard, Graham Norton
> HRB 21284 (AG N?rnberg)
> Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
>