2023-04-10 11:43:55

by Yu Kuai

[permalink] [raw]
Subject: [PATCH -next v5 0/6] md: fix uaf for sync_thread

From: Yu Kuai <[email protected]>

Changes in v5:
- use rcu_dereference_protected() instead of rcu_access_pointer() where
rcu_read_lock/unlock is not required.
- add patch 4,5 to handle that bitmap timeout is set multiple times.

Changes in v4:
- remove patch 2 from v3
- fix sparse errors and warnings from v3, in order to do that, all access
to md_thread need to be modified, patch 2-4 is splited to avoid a huge
patch.

Changes in v3:
- remove patch 3 from v2
- use rcu instead of a new lock

Changes in v2:
- fix a compile error for md-cluster in patch 2
- replace spin_lock/unlock with spin_lock/unlock_irq in patch 5
- don't wake up inside the new lock in md wakeup_thread in patch 5

Yu Kuai (6):
md: pass a md_thread pointer to md_register_thread()
md: factor out a helper to wake up md_thread directly
dm-raid: remove useless checking in raid_message()
md/bitmap: always wake up md_thread in timeout_store
md/bitmap: factor out a helper to set timeout
md: protect md_thread with rcu

drivers/md/dm-raid.c | 4 +-
drivers/md/md-bitmap.c | 50 +++++++++++------
drivers/md/md-cluster.c | 11 ++--
drivers/md/md-multipath.c | 6 +--
drivers/md/md.c | 110 ++++++++++++++++++++------------------
drivers/md/md.h | 15 +++---
drivers/md/raid1.c | 9 ++--
drivers/md/raid1.h | 2 +-
drivers/md/raid10.c | 25 ++++-----
drivers/md/raid10.h | 2 +-
drivers/md/raid5-cache.c | 20 +++----
drivers/md/raid5.c | 19 +++----
drivers/md/raid5.h | 2 +-
13 files changed, 148 insertions(+), 127 deletions(-)

--
2.39.2


2023-04-10 11:44:05

by Yu Kuai

[permalink] [raw]
Subject: [PATCH -next v5 4/6] md/bitmap: always wake up md_thread in timeout_store

From: Yu Kuai <[email protected]>

md_wakeup_thread() can handle the case that pass in md_thread is NULL,
the only difference is that md_wakeup_thread() will be called when
current timeout is 'MAX_SCHEDULE_TIMEOUT', this should not matter
because timeout_store() is not hot path, and the daemon process is
woke up more than demand from other context already.

This patch prepare to factor out a helper to set timeout.

Signed-off-by: Yu Kuai <[email protected]>
---
drivers/md/md-bitmap.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/md/md-bitmap.c b/drivers/md/md-bitmap.c
index e7cc6ba1b657..014e5c8a4fe0 100644
--- a/drivers/md/md-bitmap.c
+++ b/drivers/md/md-bitmap.c
@@ -2452,11 +2452,11 @@ timeout_store(struct mddev *mddev, const char *buf, size_t len)
* the bitmap is all clean and we don't need to
* adjust the timeout right now
*/
- if (mddev->thread->timeout < MAX_SCHEDULE_TIMEOUT) {
+ if (mddev->thread->timeout < MAX_SCHEDULE_TIMEOUT)
mddev->thread->timeout = timeout;
- md_wakeup_thread(mddev->thread);
- }
}
+
+ md_wakeup_thread(mddev->thread);
return len;
}

--
2.39.2

2023-04-10 11:44:08

by Yu Kuai

[permalink] [raw]
Subject: [PATCH -next v5 3/6] dm-raid: remove useless checking in raid_message()

From: Yu Kuai <[email protected]>

md_wakeup_thread() handle the case that pass in md_thread is NULL, there
is no need to check this. Prepare to protect md_thread with rcu.

Signed-off-by: Yu Kuai <[email protected]>
---
drivers/md/dm-raid.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/md/dm-raid.c b/drivers/md/dm-raid.c
index 2dfd51509647..9db45f3508e3 100644
--- a/drivers/md/dm-raid.c
+++ b/drivers/md/dm-raid.c
@@ -3750,11 +3750,11 @@ static int raid_message(struct dm_target *ti, unsigned int argc, char **argv,
* canceling read-auto mode
*/
mddev->ro = 0;
- if (!mddev->suspended && mddev->sync_thread)
+ if (!mddev->suspended)
md_wakeup_thread(mddev->sync_thread);
}
set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
- if (!mddev->suspended && mddev->thread)
+ if (!mddev->suspended)
md_wakeup_thread(mddev->thread);

return 0;
--
2.39.2

2023-04-10 11:45:29

by Yu Kuai

[permalink] [raw]
Subject: [PATCH -next v5 5/6] md/bitmap: factor out a helper to set timeout

From: Yu Kuai <[email protected]>

Register/unregister 'mddev->thread' are both under 'reconfig_mutex',
however, some context didn't hold the mutex to access mddev->thread,
which can cause null-ptr-deference:

1) md_bitmap_daemon_work() can be called from md_check_recovery() where
'reconfig_mutex' is not held, deference 'mddev->thread' might cause
null-ptr-deference, because md_unregister_thread() reset the pointer
before stopping the thread.

2) timeout_store() access 'mddev->thread' multiple times,
null-ptr-deference can be triggered if 'mddev->thread' is reset in the
middle.

This patch factor out a helper to set timeout, the new helper always
check if 'mddev->thread' is null first, so that problem 1 can be fixed.

Now that this helper only access 'mddev->thread' once, but it's possible
that 'mddev->thread' is freed while this helper is still in progress,
hence the problem is not fixed yet. Follow up patches will fix this by
protecting md_thread with rcu.

Signed-off-by: Yu Kuai <[email protected]>
---
drivers/md/md-bitmap.c | 35 +++++++++++++++++++----------------
1 file changed, 19 insertions(+), 16 deletions(-)

diff --git a/drivers/md/md-bitmap.c b/drivers/md/md-bitmap.c
index 014e5c8a4fe0..29fd41ef55a6 100644
--- a/drivers/md/md-bitmap.c
+++ b/drivers/md/md-bitmap.c
@@ -1218,11 +1218,22 @@ static bitmap_counter_t *md_bitmap_get_counter(struct bitmap_counts *bitmap,
sector_t offset, sector_t *blocks,
int create);

+static void mddev_set_timeout(struct mddev *mddev, unsigned long timeout,
+ bool force)
+{
+ struct md_thread *thread = mddev->thread;
+
+ if (!thread)
+ return;
+
+ if (force || thread->timeout < MAX_SCHEDULE_TIMEOUT)
+ thread->timeout = timeout;
+}
+
/*
* bitmap daemon -- periodically wakes up to clean bits and flush pages
* out to disk
*/
-
void md_bitmap_daemon_work(struct mddev *mddev)
{
struct bitmap *bitmap;
@@ -1246,7 +1257,7 @@ void md_bitmap_daemon_work(struct mddev *mddev)

bitmap->daemon_lastrun = jiffies;
if (bitmap->allclean) {
- mddev->thread->timeout = MAX_SCHEDULE_TIMEOUT;
+ mddev_set_timeout(mddev, MAX_SCHEDULE_TIMEOUT, true);
goto done;
}
bitmap->allclean = 1;
@@ -1343,8 +1354,7 @@ void md_bitmap_daemon_work(struct mddev *mddev)

done:
if (bitmap->allclean == 0)
- mddev->thread->timeout =
- mddev->bitmap_info.daemon_sleep;
+ mddev_set_timeout(mddev, mddev->bitmap_info.daemon_sleep, true);
mutex_unlock(&mddev->bitmap_info.mutex);
}

@@ -1797,8 +1807,7 @@ void md_bitmap_destroy(struct mddev *mddev)
mddev->bitmap = NULL; /* disconnect from the md device */
spin_unlock(&mddev->lock);
mutex_unlock(&mddev->bitmap_info.mutex);
- if (mddev->thread)
- mddev->thread->timeout = MAX_SCHEDULE_TIMEOUT;
+ mddev_set_timeout(mddev, MAX_SCHEDULE_TIMEOUT, true);

md_bitmap_free(bitmap);
}
@@ -1941,7 +1950,7 @@ int md_bitmap_load(struct mddev *mddev)
/* Kick recovery in case any bits were set */
set_bit(MD_RECOVERY_NEEDED, &bitmap->mddev->recovery);

- mddev->thread->timeout = mddev->bitmap_info.daemon_sleep;
+ mddev_set_timeout(mddev, mddev->bitmap_info.daemon_sleep, true);
md_wakeup_thread(mddev->thread);

md_bitmap_update_sb(bitmap);
@@ -2446,17 +2455,11 @@ timeout_store(struct mddev *mddev, const char *buf, size_t len)
timeout = MAX_SCHEDULE_TIMEOUT-1;
if (timeout < 1)
timeout = 1;
- mddev->bitmap_info.daemon_sleep = timeout;
- if (mddev->thread) {
- /* if thread->timeout is MAX_SCHEDULE_TIMEOUT, then
- * the bitmap is all clean and we don't need to
- * adjust the timeout right now
- */
- if (mddev->thread->timeout < MAX_SCHEDULE_TIMEOUT)
- mddev->thread->timeout = timeout;
- }

+ mddev->bitmap_info.daemon_sleep = timeout;
+ mddev_set_timeout(mddev, timeout, false);
md_wakeup_thread(mddev->thread);
+
return len;
}

--
2.39.2

2023-04-10 11:46:18

by Yu Kuai

[permalink] [raw]
Subject: [PATCH -next v5 1/6] md: pass a md_thread pointer to md_register_thread()

From: Yu Kuai <[email protected]>

Prepare to protect md_thread with rcu, there are no functional changes.

Signed-off-by: Yu Kuai <[email protected]>
---
drivers/md/md-cluster.c | 11 +++++------
drivers/md/md-multipath.c | 6 +++---
drivers/md/md.c | 27 ++++++++++++++-------------
drivers/md/md.h | 7 +++----
drivers/md/raid1.c | 5 ++---
drivers/md/raid10.c | 15 ++++++---------
drivers/md/raid5-cache.c | 5 ++---
drivers/md/raid5.c | 15 ++++++---------
8 files changed, 41 insertions(+), 50 deletions(-)

diff --git a/drivers/md/md-cluster.c b/drivers/md/md-cluster.c
index 10e0c5381d01..c19e29cb73bf 100644
--- a/drivers/md/md-cluster.c
+++ b/drivers/md/md-cluster.c
@@ -362,9 +362,8 @@ static void __recover_slot(struct mddev *mddev, int slot)

set_bit(slot, &cinfo->recovery_map);
if (!cinfo->recovery_thread) {
- cinfo->recovery_thread = md_register_thread(recover_bitmaps,
- mddev, "recover");
- if (!cinfo->recovery_thread) {
+ if (md_register_thread(&cinfo->recovery_thread, recover_bitmaps,
+ mddev, "recover")) {
pr_warn("md-cluster: Could not create recovery thread\n");
return;
}
@@ -888,9 +887,9 @@ static int join(struct mddev *mddev, int nodes)
goto err;
}
/* Initiate the communication resources */
- ret = -ENOMEM;
- cinfo->recv_thread = md_register_thread(recv_daemon, mddev, "cluster_recv");
- if (!cinfo->recv_thread) {
+ ret = md_register_thread(&cinfo->recv_thread, recv_daemon, mddev,
+ "cluster_recv");
+ if (ret) {
pr_err("md-cluster: cannot allocate memory for recv_thread!\n");
goto err;
}
diff --git a/drivers/md/md-multipath.c b/drivers/md/md-multipath.c
index 66edf5e72bd6..ceec9e4b2a60 100644
--- a/drivers/md/md-multipath.c
+++ b/drivers/md/md-multipath.c
@@ -400,9 +400,9 @@ static int multipath_run (struct mddev *mddev)
if (ret)
goto out_free_conf;

- mddev->thread = md_register_thread(multipathd, mddev,
- "multipath");
- if (!mddev->thread)
+ ret = md_register_thread(&mddev->thread, multipathd, mddev,
+ "multipath");
+ if (ret)
goto out_free_conf;

pr_info("multipath: array %s active with %d out of %d IO paths\n",
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 9bc05f451d42..1459c2cfb0dd 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -7896,29 +7896,32 @@ void md_wakeup_thread(struct md_thread *thread)
}
EXPORT_SYMBOL(md_wakeup_thread);

-struct md_thread *md_register_thread(void (*run) (struct md_thread *),
- struct mddev *mddev, const char *name)
+int md_register_thread(struct md_thread **threadp,
+ void (*run)(struct md_thread *),
+ struct mddev *mddev, const char *name)
{
struct md_thread *thread;

thread = kzalloc(sizeof(struct md_thread), GFP_KERNEL);
if (!thread)
- return NULL;
+ return -ENOMEM;

init_waitqueue_head(&thread->wqueue);

thread->run = run;
thread->mddev = mddev;
thread->timeout = MAX_SCHEDULE_TIMEOUT;
- thread->tsk = kthread_run(md_thread, thread,
- "%s_%s",
- mdname(thread->mddev),
- name);
+ thread->tsk = kthread_run(md_thread, thread, "%s_%s",
+ mdname(thread->mddev), name);
if (IS_ERR(thread->tsk)) {
+ int err = PTR_ERR(thread->tsk);
+
kfree(thread);
- return NULL;
+ return err;
}
- return thread;
+
+ *threadp = thread;
+ return 0;
}
EXPORT_SYMBOL(md_register_thread);

@@ -9199,10 +9202,8 @@ static void md_start_sync(struct work_struct *ws)
{
struct mddev *mddev = container_of(ws, struct mddev, del_work);

- mddev->sync_thread = md_register_thread(md_do_sync,
- mddev,
- "resync");
- if (!mddev->sync_thread) {
+ if (md_register_thread(&mddev->sync_thread, md_do_sync, mddev,
+ "resync")) {
pr_warn("%s: could not start resync thread...\n",
mdname(mddev));
/* leave the spares where they are, it shouldn't hurt */
diff --git a/drivers/md/md.h b/drivers/md/md.h
index e148e3c83b0d..7af45b8432e9 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -730,10 +730,9 @@ extern int register_md_cluster_operations(struct md_cluster_operations *ops,
extern int unregister_md_cluster_operations(void);
extern int md_setup_cluster(struct mddev *mddev, int nodes);
extern void md_cluster_stop(struct mddev *mddev);
-extern struct md_thread *md_register_thread(
- void (*run)(struct md_thread *thread),
- struct mddev *mddev,
- const char *name);
+extern int md_register_thread(struct md_thread **threadp,
+ void (*run)(struct md_thread *thread),
+ struct mddev *mddev, const char *name);
extern void md_unregister_thread(struct md_thread **threadp);
extern void md_wakeup_thread(struct md_thread *thread);
extern void md_check_recovery(struct mddev *mddev);
diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 68a9e2d9985b..1217c1db0a40 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -3083,9 +3083,8 @@ static struct r1conf *setup_conf(struct mddev *mddev)
}
}

- err = -ENOMEM;
- conf->thread = md_register_thread(raid1d, mddev, "raid1");
- if (!conf->thread)
+ err = md_register_thread(&conf->thread, raid1d, mddev, "raid1");
+ if (err)
goto abort;

return conf;
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 6c66357f92f5..0171ba4f19b0 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -4077,9 +4077,8 @@ static struct r10conf *setup_conf(struct mddev *mddev)
init_waitqueue_head(&conf->wait_barrier);
atomic_set(&conf->nr_pending, 0);

- err = -ENOMEM;
- conf->thread = md_register_thread(raid10d, mddev, "raid10");
- if (!conf->thread)
+ err = md_register_thread(&conf->thread, raid10d, mddev, "raid10");
+ if (err)
goto out;

conf->mddev = mddev;
@@ -4273,9 +4272,8 @@ static int raid10_run(struct mddev *mddev)
clear_bit(MD_RECOVERY_CHECK, &mddev->recovery);
set_bit(MD_RECOVERY_RESHAPE, &mddev->recovery);
set_bit(MD_RECOVERY_RUNNING, &mddev->recovery);
- mddev->sync_thread = md_register_thread(md_do_sync, mddev,
- "reshape");
- if (!mddev->sync_thread)
+ if (md_register_thread(&mddev->sync_thread, md_do_sync, mddev,
+ "reshape"))
goto out_free_conf;
}

@@ -4686,9 +4684,8 @@ static int raid10_start_reshape(struct mddev *mddev)
set_bit(MD_RECOVERY_RESHAPE, &mddev->recovery);
set_bit(MD_RECOVERY_RUNNING, &mddev->recovery);

- mddev->sync_thread = md_register_thread(md_do_sync, mddev,
- "reshape");
- if (!mddev->sync_thread) {
+ if (md_register_thread(&mddev->sync_thread, md_do_sync, mddev,
+ "reshape")) {
ret = -EAGAIN;
goto abort;
}
diff --git a/drivers/md/raid5-cache.c b/drivers/md/raid5-cache.c
index 46182b955aef..0464d4d551fc 100644
--- a/drivers/md/raid5-cache.c
+++ b/drivers/md/raid5-cache.c
@@ -3121,9 +3121,8 @@ int r5l_init_log(struct r5conf *conf, struct md_rdev *rdev)
spin_lock_init(&log->tree_lock);
INIT_RADIX_TREE(&log->big_stripe_tree, GFP_NOWAIT | __GFP_NOWARN);

- log->reclaim_thread = md_register_thread(r5l_reclaim_thread,
- log->rdev->mddev, "reclaim");
- if (!log->reclaim_thread)
+ if (md_register_thread(&log->reclaim_thread, r5l_reclaim_thread,
+ log->rdev->mddev, "reclaim"))
goto reclaim_thread;
log->reclaim_thread->timeout = R5C_RECLAIM_WAKEUP_INTERVAL;

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 7b820b81d8c2..04b1093195d0 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -7665,11 +7665,10 @@ static struct r5conf *setup_conf(struct mddev *mddev)
}

sprintf(pers_name, "raid%d", mddev->new_level);
- conf->thread = md_register_thread(raid5d, mddev, pers_name);
- if (!conf->thread) {
+ ret = md_register_thread(&conf->thread, raid5d, mddev, pers_name);
+ if (ret) {
pr_warn("md/raid:%s: couldn't allocate thread.\n",
mdname(mddev));
- ret = -ENOMEM;
goto abort;
}

@@ -7989,9 +7988,8 @@ static int raid5_run(struct mddev *mddev)
clear_bit(MD_RECOVERY_CHECK, &mddev->recovery);
set_bit(MD_RECOVERY_RESHAPE, &mddev->recovery);
set_bit(MD_RECOVERY_RUNNING, &mddev->recovery);
- mddev->sync_thread = md_register_thread(md_do_sync, mddev,
- "reshape");
- if (!mddev->sync_thread)
+ if (md_register_thread(&mddev->sync_thread, md_do_sync, mddev,
+ "reshape"))
goto abort;
}

@@ -8567,9 +8565,8 @@ static int raid5_start_reshape(struct mddev *mddev)
clear_bit(MD_RECOVERY_DONE, &mddev->recovery);
set_bit(MD_RECOVERY_RESHAPE, &mddev->recovery);
set_bit(MD_RECOVERY_RUNNING, &mddev->recovery);
- mddev->sync_thread = md_register_thread(md_do_sync, mddev,
- "reshape");
- if (!mddev->sync_thread) {
+ if (md_register_thread(&mddev->sync_thread, md_do_sync, mddev,
+ "reshape")) {
mddev->recovery = 0;
spin_lock_irq(&conf->device_lock);
write_seqcount_begin(&conf->gen_lock);
--
2.39.2

2023-04-10 11:47:37

by Yu Kuai

[permalink] [raw]
Subject: [PATCH -next v5 6/6] md: protect md_thread with rcu

From: Yu Kuai <[email protected]>

Our test reports a uaf for 'mddev->sync_thread':

T1 T2
md_start_sync
md_register_thread
// mddev->sync_thread is set
raid1d
md_check_recovery
md_reap_sync_thread
md_unregister_thread
kfree

md_wakeup_thread
wake_up
->sync_thread was freed

Root cause is that there is a small windown between register thread and
wake up thread, where the thread can be freed concurrently.

Currently, a global spinlock 'pers_lock' is borrowed to protect
'mddev->thread', this problem can be fixed likewise, however, there might
be similar problem elsewhere, and use a global lock for all the cases is
not good.

This patch protect md_thread with rcu.

Signed-off-by: Yu Kuai <[email protected]>
---
drivers/md/md-bitmap.c | 29 ++++++++++++-----
drivers/md/md.c | 68 +++++++++++++++++++---------------------
drivers/md/md.h | 10 +++---
drivers/md/raid1.c | 4 +--
drivers/md/raid1.h | 2 +-
drivers/md/raid10.c | 10 ++++--
drivers/md/raid10.h | 2 +-
drivers/md/raid5-cache.c | 15 +++++----
drivers/md/raid5.c | 4 +--
drivers/md/raid5.h | 2 +-
10 files changed, 81 insertions(+), 65 deletions(-)

diff --git a/drivers/md/md-bitmap.c b/drivers/md/md-bitmap.c
index 29fd41ef55a6..b9baeea5605e 100644
--- a/drivers/md/md-bitmap.c
+++ b/drivers/md/md-bitmap.c
@@ -1219,15 +1219,27 @@ static bitmap_counter_t *md_bitmap_get_counter(struct bitmap_counts *bitmap,
int create);

static void mddev_set_timeout(struct mddev *mddev, unsigned long timeout,
- bool force)
+ bool force, bool protected)
{
- struct md_thread *thread = mddev->thread;
+ struct md_thread *thread;
+
+ if (!protected) {
+ rcu_read_lock();
+ thread = rcu_dereference(mddev->thread);
+ } else {
+ thread = rcu_dereference_protected(mddev->thread,
+ lockdep_is_held(&mddev->reconfig_mutex));
+ }

if (!thread)
- return;
+ goto out;

if (force || thread->timeout < MAX_SCHEDULE_TIMEOUT)
thread->timeout = timeout;
+
+out:
+ if (!protected)
+ rcu_read_unlock();
}

/*
@@ -1257,7 +1269,7 @@ void md_bitmap_daemon_work(struct mddev *mddev)

bitmap->daemon_lastrun = jiffies;
if (bitmap->allclean) {
- mddev_set_timeout(mddev, MAX_SCHEDULE_TIMEOUT, true);
+ mddev_set_timeout(mddev, MAX_SCHEDULE_TIMEOUT, true, false);
goto done;
}
bitmap->allclean = 1;
@@ -1354,7 +1366,8 @@ void md_bitmap_daemon_work(struct mddev *mddev)

done:
if (bitmap->allclean == 0)
- mddev_set_timeout(mddev, mddev->bitmap_info.daemon_sleep, true);
+ mddev_set_timeout(mddev, mddev->bitmap_info.daemon_sleep, true,
+ false);
mutex_unlock(&mddev->bitmap_info.mutex);
}

@@ -1807,7 +1820,7 @@ void md_bitmap_destroy(struct mddev *mddev)
mddev->bitmap = NULL; /* disconnect from the md device */
spin_unlock(&mddev->lock);
mutex_unlock(&mddev->bitmap_info.mutex);
- mddev_set_timeout(mddev, MAX_SCHEDULE_TIMEOUT, true);
+ mddev_set_timeout(mddev, MAX_SCHEDULE_TIMEOUT, true, true);

md_bitmap_free(bitmap);
}
@@ -1950,7 +1963,7 @@ int md_bitmap_load(struct mddev *mddev)
/* Kick recovery in case any bits were set */
set_bit(MD_RECOVERY_NEEDED, &bitmap->mddev->recovery);

- mddev_set_timeout(mddev, mddev->bitmap_info.daemon_sleep, true);
+ mddev_set_timeout(mddev, mddev->bitmap_info.daemon_sleep, true, true);
md_wakeup_thread(mddev->thread);

md_bitmap_update_sb(bitmap);
@@ -2457,7 +2470,7 @@ timeout_store(struct mddev *mddev, const char *buf, size_t len)
timeout = 1;

mddev->bitmap_info.daemon_sleep = timeout;
- mddev_set_timeout(mddev, timeout, false);
+ mddev_set_timeout(mddev, timeout, false, false);
md_wakeup_thread(mddev->thread);

return len;
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 139c7b0202e3..3afece35f0ee 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -70,11 +70,7 @@
#include "md-bitmap.h"
#include "md-cluster.h"

-/* pers_list is a list of registered personalities protected
- * by pers_lock.
- * pers_lock does extra service to protect accesses to
- * mddev->thread when the mutex cannot be held.
- */
+/* pers_list is a list of registered personalities protected by pers_lock. */
static LIST_HEAD(pers_list);
static DEFINE_SPINLOCK(pers_lock);

@@ -92,7 +88,7 @@ static struct workqueue_struct *md_rdev_misc_wq;
static int remove_and_add_spares(struct mddev *mddev,
struct md_rdev *this);
static void mddev_detach(struct mddev *mddev);
-static void md_wakeup_thread_directly(struct md_thread *thread);
+static void md_wakeup_thread_directly(struct md_thread __rcu *thread);

enum md_ro_state {
MD_RDWR,
@@ -458,8 +454,10 @@ static void md_submit_bio(struct bio *bio)
*/
void mddev_suspend(struct mddev *mddev)
{
- WARN_ON_ONCE(mddev->thread && current == mddev->thread->tsk);
- lockdep_assert_held(&mddev->reconfig_mutex);
+ struct md_thread *thread = rcu_dereference_protected(mddev->thread,
+ lockdep_is_held(&mddev->reconfig_mutex));
+
+ WARN_ON_ONCE(thread && current == thread->tsk);
if (mddev->suspended++)
return;
wake_up(&mddev->sb_wait);
@@ -801,13 +799,8 @@ void mddev_unlock(struct mddev *mddev)
} else
mutex_unlock(&mddev->reconfig_mutex);

- /* As we've dropped the mutex we need a spinlock to
- * make sure the thread doesn't disappear
- */
- spin_lock(&pers_lock);
md_wakeup_thread(mddev->thread);
wake_up(&mddev->sb_wait);
- spin_unlock(&pers_lock);
}
EXPORT_SYMBOL_GPL(mddev_unlock);

@@ -7891,23 +7884,33 @@ static int md_thread(void *arg)
return 0;
}

-static void md_wakeup_thread_directly(struct md_thread *thread)
+static void md_wakeup_thread_directly(struct md_thread __rcu *thread)
{
- if (thread)
- wake_up_process(thread->tsk);
+ struct md_thread *t;
+
+ rcu_read_lock();
+ t = rcu_dereference(thread);
+ if (t)
+ wake_up_process(t->tsk);
+ rcu_read_unlock();
}

-void md_wakeup_thread(struct md_thread *thread)
+void md_wakeup_thread(struct md_thread __rcu *thread)
{
- if (thread) {
- pr_debug("md: waking up MD thread %s.\n", thread->tsk->comm);
- set_bit(THREAD_WAKEUP, &thread->flags);
- wake_up(&thread->wqueue);
+ struct md_thread *t;
+
+ rcu_read_lock();
+ t = rcu_dereference(thread);
+ if (t) {
+ pr_debug("md: waking up MD thread %s.\n", t->tsk->comm);
+ set_bit(THREAD_WAKEUP, &t->flags);
+ wake_up(&t->wqueue);
}
+ rcu_read_unlock();
}
EXPORT_SYMBOL(md_wakeup_thread);

-int md_register_thread(struct md_thread **threadp,
+int md_register_thread(struct md_thread __rcu **threadp,
void (*run)(struct md_thread *),
struct mddev *mddev, const char *name)
{
@@ -7931,27 +7934,20 @@ int md_register_thread(struct md_thread **threadp,
return err;
}

- *threadp = thread;
+ rcu_assign_pointer(*threadp, thread);
return 0;
}
EXPORT_SYMBOL(md_register_thread);

-void md_unregister_thread(struct md_thread **threadp)
+void md_unregister_thread(struct md_thread __rcu **threadp)
{
- struct md_thread *thread;
+ struct md_thread *thread = rcu_dereference_protected(*threadp, true);

- /*
- * Locking ensures that mddev_unlock does not wake_up a
- * non-existent thread
- */
- spin_lock(&pers_lock);
- thread = *threadp;
- if (!thread) {
- spin_unlock(&pers_lock);
+ if (!thread)
return;
- }
- *threadp = NULL;
- spin_unlock(&pers_lock);
+
+ rcu_assign_pointer(*threadp, NULL);
+ synchronize_rcu();

pr_debug("interrupting MD-thread pid %d\n", task_pid_nr(thread->tsk));
kthread_stop(thread->tsk);
diff --git a/drivers/md/md.h b/drivers/md/md.h
index 7af45b8432e9..621a3f183afd 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -367,8 +367,8 @@ struct mddev {
int new_chunk_sectors;
int reshape_backwards;

- struct md_thread *thread; /* management thread */
- struct md_thread *sync_thread; /* doing resync or reconstruct */
+ struct md_thread __rcu *thread; /* management thread */
+ struct md_thread __rcu *sync_thread; /* doing resync or reconstruct */

/* 'last_sync_action' is initialized to "none". It is set when a
* sync operation (i.e "data-check", "requested-resync", "resync",
@@ -730,11 +730,11 @@ extern int register_md_cluster_operations(struct md_cluster_operations *ops,
extern int unregister_md_cluster_operations(void);
extern int md_setup_cluster(struct mddev *mddev, int nodes);
extern void md_cluster_stop(struct mddev *mddev);
-extern int md_register_thread(struct md_thread **threadp,
+extern int md_register_thread(struct md_thread __rcu **threadp,
void (*run)(struct md_thread *thread),
struct mddev *mddev, const char *name);
-extern void md_unregister_thread(struct md_thread **threadp);
-extern void md_wakeup_thread(struct md_thread *thread);
+extern void md_unregister_thread(struct md_thread __rcu **threadp);
+extern void md_wakeup_thread(struct md_thread __rcu *thread);
extern void md_check_recovery(struct mddev *mddev);
extern void md_reap_sync_thread(struct mddev *mddev);
extern int mddev_init_writes_pending(struct mddev *mddev);
diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 1217c1db0a40..64f019e3615f 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -3176,8 +3176,8 @@ static int raid1_run(struct mddev *mddev)
/*
* Ok, everything is just fine now
*/
- mddev->thread = conf->thread;
- conf->thread = NULL;
+ rcu_assign_pointer(mddev->thread, conf->thread);
+ rcu_assign_pointer(conf->thread, NULL);
mddev->private = conf;
set_bit(MD_FAILFAST_SUPPORTED, &mddev->flags);

diff --git a/drivers/md/raid1.h b/drivers/md/raid1.h
index ebb6788820e7..468f189da7a0 100644
--- a/drivers/md/raid1.h
+++ b/drivers/md/raid1.h
@@ -130,7 +130,7 @@ struct r1conf {
/* When taking over an array from a different personality, we store
* the new thread here until we fully activate the array.
*/
- struct md_thread *thread;
+ struct md_thread __rcu *thread;

/* Keep track of cluster resync window to send to other
* nodes.
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 0171ba4f19b0..8fa4e61c3f79 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -980,6 +980,7 @@ static void lower_barrier(struct r10conf *conf)
static bool stop_waiting_barrier(struct r10conf *conf)
{
struct bio_list *bio_list = current->bio_list;
+ struct md_thread *thread;

/* barrier is dropped */
if (!conf->barrier)
@@ -995,8 +996,11 @@ static bool stop_waiting_barrier(struct r10conf *conf)
(!bio_list_empty(&bio_list[0]) || !bio_list_empty(&bio_list[1])))
return true;

+ /* daemon thread must exist while handling io */
+ thread = rcu_dereference_protected(conf->mddev->thread, true);
+
/* move on if recovery thread is blocked by us */
- if (conf->mddev->thread->tsk == current &&
+ if (thread->tsk == current &&
test_bit(MD_RECOVERY_RUNNING, &conf->mddev->recovery) &&
conf->nr_queued > 0)
return true;
@@ -4140,8 +4144,8 @@ static int raid10_run(struct mddev *mddev)
}
}

- mddev->thread = conf->thread;
- conf->thread = NULL;
+ rcu_assign_pointer(mddev->thread, conf->thread);
+ rcu_assign_pointer(conf->thread, NULL);

if (mddev->queue) {
blk_queue_max_write_zeroes_sectors(mddev->queue, 0);
diff --git a/drivers/md/raid10.h b/drivers/md/raid10.h
index 8c072ce0bc54..63e48b11b552 100644
--- a/drivers/md/raid10.h
+++ b/drivers/md/raid10.h
@@ -100,7 +100,7 @@ struct r10conf {
/* When taking over an array from a different personality, we store
* the new thread here until we fully activate the array.
*/
- struct md_thread *thread;
+ struct md_thread __rcu *thread;

/*
* Keep track of cluster resync window to send to other nodes.
diff --git a/drivers/md/raid5-cache.c b/drivers/md/raid5-cache.c
index 0464d4d551fc..68c4d3a1fd25 100644
--- a/drivers/md/raid5-cache.c
+++ b/drivers/md/raid5-cache.c
@@ -120,7 +120,7 @@ struct r5l_log {
struct bio_set bs;
mempool_t meta_pool;

- struct md_thread *reclaim_thread;
+ struct md_thread __rcu *reclaim_thread;
unsigned long reclaim_target; /* number of space that need to be
* reclaimed. if it's 0, reclaim spaces
* used by io_units which are in
@@ -1576,17 +1576,18 @@ void r5l_wake_reclaim(struct r5l_log *log, sector_t space)

void r5l_quiesce(struct r5l_log *log, int quiesce)
{
- struct mddev *mddev;
+ struct mddev *mddev = log->rdev->mddev;
+ struct md_thread *thread = rcu_dereference_protected(
+ log->reclaim_thread, lockdep_is_held(&mddev->reconfig_mutex));

if (quiesce) {
/* make sure r5l_write_super_and_discard_space exits */
- mddev = log->rdev->mddev;
wake_up(&mddev->sb_wait);
- kthread_park(log->reclaim_thread->tsk);
+ kthread_park(thread->tsk);
r5l_wake_reclaim(log, MaxSector);
r5l_do_reclaim(log);
} else
- kthread_unpark(log->reclaim_thread->tsk);
+ kthread_unpark(thread->tsk);
}

bool r5l_log_disk_error(struct r5conf *conf)
@@ -3124,7 +3125,9 @@ int r5l_init_log(struct r5conf *conf, struct md_rdev *rdev)
if (md_register_thread(&log->reclaim_thread, r5l_reclaim_thread,
log->rdev->mddev, "reclaim"))
goto reclaim_thread;
- log->reclaim_thread->timeout = R5C_RECLAIM_WAKEUP_INTERVAL;
+
+ rcu_dereference_protected(log->reclaim_thread, true)->timeout =
+ R5C_RECLAIM_WAKEUP_INTERVAL;

init_waitqueue_head(&log->iounit_wait);

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 04b1093195d0..a7e47c37daf3 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -7888,8 +7888,8 @@ static int raid5_run(struct mddev *mddev)
}

conf->min_offset_diff = min_offset_diff;
- mddev->thread = conf->thread;
- conf->thread = NULL;
+ rcu_assign_pointer(mddev->thread, conf->thread);
+ rcu_assign_pointer(conf->thread, NULL);
mddev->private = conf;

for (i = 0; i < conf->raid_disks && conf->previous_raid_disks;
diff --git a/drivers/md/raid5.h b/drivers/md/raid5.h
index e873938a6125..f19707189a7b 100644
--- a/drivers/md/raid5.h
+++ b/drivers/md/raid5.h
@@ -679,7 +679,7 @@ struct r5conf {
/* When taking over an array from a different personality, we store
* the new thread here until we fully activate the array.
*/
- struct md_thread *thread;
+ struct md_thread __rcu *thread;
struct list_head temp_inactive_list[NR_STRIPE_HASH_LOCKS];
struct r5worker_group *worker_groups;
int group_cnt;
--
2.39.2

2023-04-10 15:43:22

by Logan Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH -next v5 6/6] md: protect md_thread with rcu



On 2023-04-10 05:35, Yu Kuai wrote:
> From: Yu Kuai <[email protected]>
>
> Our test reports a uaf for 'mddev->sync_thread':
>
> T1 T2
> md_start_sync
> md_register_thread
> // mddev->sync_thread is set
> raid1d
> md_check_recovery
> md_reap_sync_thread
> md_unregister_thread
> kfree
>
> md_wakeup_thread
> wake_up
> ->sync_thread was freed
>
> Root cause is that there is a small windown between register thread and
> wake up thread, where the thread can be freed concurrently.
>
> Currently, a global spinlock 'pers_lock' is borrowed to protect
> 'mddev->thread', this problem can be fixed likewise, however, there might
> be similar problem elsewhere, and use a global lock for all the cases is
> not good.
>
> This patch protect md_thread with rcu.
>
> Signed-off-by: Yu Kuai <[email protected]>
> ---
> drivers/md/md-bitmap.c | 29 ++++++++++++-----
> drivers/md/md.c | 68 +++++++++++++++++++---------------------
> drivers/md/md.h | 10 +++---
> drivers/md/raid1.c | 4 +--
> drivers/md/raid1.h | 2 +-
> drivers/md/raid10.c | 10 ++++--
> drivers/md/raid10.h | 2 +-
> drivers/md/raid5-cache.c | 15 +++++----
> drivers/md/raid5.c | 4 +--
> drivers/md/raid5.h | 2 +-
> 10 files changed, 81 insertions(+), 65 deletions(-)
>
> diff --git a/drivers/md/md-bitmap.c b/drivers/md/md-bitmap.c
> index 29fd41ef55a6..b9baeea5605e 100644
> --- a/drivers/md/md-bitmap.c
> +++ b/drivers/md/md-bitmap.c
> @@ -1219,15 +1219,27 @@ static bitmap_counter_t *md_bitmap_get_counter(struct bitmap_counts *bitmap,
> int create);
>
> static void mddev_set_timeout(struct mddev *mddev, unsigned long timeout,
> - bool force)
> + bool force, bool protected)
> {
> - struct md_thread *thread = mddev->thread;
> + struct md_thread *thread;
> +
> + if (!protected) {
> + rcu_read_lock();
> + thread = rcu_dereference(mddev->thread);
> + } else {
> + thread = rcu_dereference_protected(mddev->thread,
> + lockdep_is_held(&mddev->reconfig_mutex));
> + }

Why not just always use rcu_read_lock()? Even if it's safe with
reconfig_mutex, it wouldn't harm much and would make the code a bit less
ugly.


> @@ -458,8 +454,10 @@ static void md_submit_bio(struct bio *bio)
> */
> void mddev_suspend(struct mddev *mddev)
> {
> - WARN_ON_ONCE(mddev->thread && current == mddev->thread->tsk);
> - lockdep_assert_held(&mddev->reconfig_mutex);
> + struct md_thread *thread = rcu_dereference_protected(mddev->thread,
> + lockdep_is_held(&mddev->reconfig_mutex));

Do we know that reconfig_mutex is always held when we call
md_unregister_thread()? Seems plausible, but maybe it's worth adding a
lockdep_assert_held() to md_unregsiter_thread().

Thanks,

Logan

2023-04-11 01:08:37

by Yu Kuai

[permalink] [raw]
Subject: Re: [PATCH -next v5 6/6] md: protect md_thread with rcu

Hi,

在 2023/04/10 23:42, Logan Gunthorpe 写道:
>
>
> On 2023-04-10 05:35, Yu Kuai wrote:
>> From: Yu Kuai <[email protected]>
>>
>> Our test reports a uaf for 'mddev->sync_thread':
>>
>> T1 T2
>> md_start_sync
>> md_register_thread
>> // mddev->sync_thread is set
>> raid1d
>> md_check_recovery
>> md_reap_sync_thread
>> md_unregister_thread
>> kfree
>>
>> md_wakeup_thread
>> wake_up
>> ->sync_thread was freed
>>
>> Root cause is that there is a small windown between register thread and
>> wake up thread, where the thread can be freed concurrently.
>>
>> Currently, a global spinlock 'pers_lock' is borrowed to protect
>> 'mddev->thread', this problem can be fixed likewise, however, there might
>> be similar problem elsewhere, and use a global lock for all the cases is
>> not good.
>>
>> This patch protect md_thread with rcu.
>>
>> Signed-off-by: Yu Kuai <[email protected]>
>> ---
>> drivers/md/md-bitmap.c | 29 ++++++++++++-----
>> drivers/md/md.c | 68 +++++++++++++++++++---------------------
>> drivers/md/md.h | 10 +++---
>> drivers/md/raid1.c | 4 +--
>> drivers/md/raid1.h | 2 +-
>> drivers/md/raid10.c | 10 ++++--
>> drivers/md/raid10.h | 2 +-
>> drivers/md/raid5-cache.c | 15 +++++----
>> drivers/md/raid5.c | 4 +--
>> drivers/md/raid5.h | 2 +-
>> 10 files changed, 81 insertions(+), 65 deletions(-)
>>
>> diff --git a/drivers/md/md-bitmap.c b/drivers/md/md-bitmap.c
>> index 29fd41ef55a6..b9baeea5605e 100644
>> --- a/drivers/md/md-bitmap.c
>> +++ b/drivers/md/md-bitmap.c
>> @@ -1219,15 +1219,27 @@ static bitmap_counter_t *md_bitmap_get_counter(struct bitmap_counts *bitmap,
>> int create);
>>
>> static void mddev_set_timeout(struct mddev *mddev, unsigned long timeout,
>> - bool force)
>> + bool force, bool protected)
>> {
>> - struct md_thread *thread = mddev->thread;
>> + struct md_thread *thread;
>> +
>> + if (!protected) {
>> + rcu_read_lock();
>> + thread = rcu_dereference(mddev->thread);
>> + } else {
>> + thread = rcu_dereference_protected(mddev->thread,
>> + lockdep_is_held(&mddev->reconfig_mutex));
>> + }
>
> Why not just always use rcu_read_lock()? Even if it's safe with
> reconfig_mutex, it wouldn't harm much and would make the code a bit less
> ugly.
>
Of course, I'll do that in next version.

>
>> @@ -458,8 +454,10 @@ static void md_submit_bio(struct bio *bio)
>> */
>> void mddev_suspend(struct mddev *mddev)
>> {
>> - WARN_ON_ONCE(mddev->thread && current == mddev->thread->tsk);
>> - lockdep_assert_held(&mddev->reconfig_mutex);
>> + struct md_thread *thread = rcu_dereference_protected(mddev->thread,
>> + lockdep_is_held(&mddev->reconfig_mutex));
>
> Do we know that reconfig_mutex is always held when we call
> md_unregister_thread()? Seems plausible, but maybe it's worth adding a
> lockdep_assert_held() to md_unregsiter_thread().

Unfortunally this is not true for now, md_unregister_thread() can be
called without this mutex from action_store(), and this is problematic,
I'm tring to revert this change in the other thread:

md: fix that MD_RECOVERY_RUNNING can be cleared while sync_thread is
still running.

I think it's not good to add lockdep_assert_held() for now...

Thanks,
Kuai
>
> Thanks,
>
> Logan
> .
>

2023-04-11 01:20:03

by Song Liu

[permalink] [raw]
Subject: Re: [PATCH -next v5 1/6] md: pass a md_thread pointer to md_register_thread()

On Mon, Apr 10, 2023 at 4:37 AM Yu Kuai <[email protected]> wrote:
>
> From: Yu Kuai <[email protected]>
>
> Prepare to protect md_thread with rcu, there are no functional changes.

Why do we need this change? To add __rcu later?

Can we do something like:

struct md_thread __rcu *md_register_thread(void (*run) (struct md_thread *),
struct mddev *mddev, const char *name)

Thanks,
Song

2023-04-11 01:47:36

by Yu Kuai

[permalink] [raw]
Subject: Re: [PATCH -next v5 1/6] md: pass a md_thread pointer to md_register_thread()

Hi,

在 2023/04/11 9:15, Song Liu 写道:
> On Mon, Apr 10, 2023 at 4:37 AM Yu Kuai <[email protected]> wrote:
>>
>> From: Yu Kuai <[email protected]>
>>
>> Prepare to protect md_thread with rcu, there are no functional changes.
>
> Why do we need this change? To add __rcu later?

Add __rcu is one reason, more importantly is to assign md_thread inside
md_register_thread in patch 6:

rcu_assign_pointer(*threadp, thread);

>
> Can we do something like:
>
> struct md_thread __rcu *md_register_thread(void (*run) (struct md_thread *),
> struct mddev *mddev, const char *name)

I think this is not necessary, if we don't want to change api, we must
use rcu_assign_pointer for each caller to set md_thread.

Thanks,
Kuai
>
> Thanks,
> Song
> .
>

2023-04-11 05:37:16

by Song Liu

[permalink] [raw]
Subject: Re: [PATCH -next v5 1/6] md: pass a md_thread pointer to md_register_thread()

On Mon, Apr 10, 2023 at 6:34 PM Yu Kuai <[email protected]> wrote:
>
> Hi,
>
> 在 2023/04/11 9:15, Song Liu 写道:
> > On Mon, Apr 10, 2023 at 4:37 AM Yu Kuai <[email protected]> wrote:
> >>
> >> From: Yu Kuai <[email protected]>
> >>
> >> Prepare to protect md_thread with rcu, there are no functional changes.
> >
> > Why do we need this change? To add __rcu later?
>
> Add __rcu is one reason, more importantly is to assign md_thread inside
> md_register_thread in patch 6:
>
> rcu_assign_pointer(*threadp, thread);

Got it.

> >
> > Can we do something like:
> >
> > struct md_thread __rcu *md_register_thread(void (*run) (struct md_thread *),
> > struct mddev *mddev, const char *name)
>
> I think this is not necessary, if we don't want to change api, we must
> use rcu_assign_pointer for each caller to set md_thread.

I think it is better to use rcu_assign_pointer at the caller side.

Thanks,
Song

2023-04-11 15:51:08

by Logan Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH -next v5 1/6] md: pass a md_thread pointer to md_register_thread()



On 2023-04-10 23:26, Song Liu wrote:
> On Mon, Apr 10, 2023 at 6:34 PM Yu Kuai <[email protected]> wrote:
>>
>> Hi,
>>
>> 在 2023/04/11 9:15, Song Liu 写道:
>>> On Mon, Apr 10, 2023 at 4:37 AM Yu Kuai <[email protected]> wrote:
>>>>
>>>> From: Yu Kuai <[email protected]>
>>>>
>>>> Prepare to protect md_thread with rcu, there are no functional changes.
>>>
>>> Why do we need this change? To add __rcu later?
>>
>> Add __rcu is one reason, more importantly is to assign md_thread inside
>> md_register_thread in patch 6:
>>
>> rcu_assign_pointer(*threadp, thread);
>
> Got it.
>
>>>
>>> Can we do something like:
>>>
>>> struct md_thread __rcu *md_register_thread(void (*run) (struct md_thread *),
>>> struct mddev *mddev, const char *name)
>>
>> I think this is not necessary, if we don't want to change api, we must
>> use rcu_assign_pointer for each caller to set md_thread.
>
> I think it is better to use rcu_assign_pointer at the caller side.

I agree.

Logan

2023-04-12 09:27:14

by Yu Kuai

[permalink] [raw]
Subject: Re: [PATCH -next v5 1/6] md: pass a md_thread pointer to md_register_thread()

Hi,

在 2023/04/11 23:46, Logan Gunthorpe 写道:

>>>> Can we do something like:
>>>>
>>>> struct md_thread __rcu *md_register_thread(void (*run) (struct md_thread *),
>>>> struct mddev *mddev, const char *name)
>>>
>>> I think this is not necessary, if we don't want to change api, we must
>>> use rcu_assign_pointer for each caller to set md_thread.
>>
>> I think it is better to use rcu_assign_pointer at the caller side.
>
> I agree.
>
> Logan
> .
>
I'll remove this patch, and use rcu_assign_pointer() in the caller
directly in the next version.

Thanks,
Kuai