Hello!
This post-RFC series shrinks the srcu_struct structure to the bare minimum
required to support SRCU readers, relegating the remaining fields to a new
srcu_usage structure. Statically allocated srcu_struct structures created
by DEFINE_SRCU() and DEFINE_STATIC_SRCU() have statically allocated
srcu_usage structures, but those required for dynamically allocated
srcu_struct structures that are initialized using init_srcu_struct()
are dynamically allocated.
The results is a reduction in the size of an srcu_struct structure from
a couple hundred bytes to just 24 bytes on x86_64 systems. This can be
helpful when SRCU readers are used in a fastpath for which the srcu_struct
structure must be embedded in another structure, and especially where
that fastpath also needs to access fields both before and after the
srcu_struct structure.
This series takes baby steps, in part because breaking SRCU means that
you get absolutely no console output. Yes, I did learn this the hard way.
Why do you ask? ;-)
Here are those baby steps:
1. rcu-tasks: Fix warning for unused tasks_rcu_exit_srcu.
2. Add whitespace to __SRCU_STRUCT_INIT() & __DEFINE_SRCU().
3. Use static init for statically allocated in-module srcu_struct.
4. Begin offloading srcu_struct fields to srcu_update.
5. Move ->level from srcu_struct to srcu_usage.
6. Move ->srcu_size_state from srcu_struct to srcu_usage.
7. Move ->srcu_cb_mutex from srcu_struct to srcu_usage.
8. Move ->lock initialization after srcu_usage allocation.
9. Move ->lock from srcu_struct to srcu_usage.
10. Move ->srcu_gp_mutex from srcu_struct to srcu_usage.
11. Move grace-period fields from srcu_struct to srcu_usage.
12. Move heuristics fields from srcu_struct to srcu_usage.
13. Move ->sda_is_static from srcu_struct to srcu_usage.
14. Move srcu_barrier() fields from srcu_struct to srcu_usage.
15. Move work-scheduling fields from srcu_struct to srcu_usage.
16. Check for readers at module-exit time.
17. Fix long lines in srcu_get_delay().
18. Fix long lines in cleanup_srcu_struct().
19. Fix long lines in srcu_gp_end().
20. Fix long lines in srcu_funnel_gp_start().
Changes since the RFC series:
https://lore.kernel.org/all/3db82572-f156-4a5d-b711-841aa28bd996@paulmck-laptop/
1. Add checks for readers of in-module statically allocated
srcu_struct structures persisting past module unload.
2. Apply Tested-by tags.
3. Apply feedback from "Zhang, Qiang1" and kernel test robot,
perhaps most notably getting rid of memory leaks and improving
the handling of statically allocated srcu_struct structures
defined within modules.
4. Drop the commit removing extraneous parentheses given the desire
to push this into the v6.4 merge window, the fact that this
commit generates conflicts with other v6.4 RCU commits, and the
low value of this commit. It therefore remains in the v6.5 pile.
Thanx, Paul
------------------------------------------------------------------------
b/include/linux/notifier.h | 5
b/include/linux/srcutiny.h | 6
b/include/linux/srcutree.h | 28 +-
b/kernel/rcu/rcu.h | 6
b/kernel/rcu/srcutree.c | 19 +
b/kernel/rcu/tasks.h | 2
include/linux/srcutree.h | 123 ++++++-----
kernel/rcu/srcutree.c | 495 +++++++++++++++++++++++----------------------
8 files changed, 370 insertions(+), 314 deletions(-)
The tasks_rcu_exit_srcu variable is used only by kernels built
with CONFIG_TASKS_RCU=y, but is defined for all kernesl with
CONFIG_TASKS_RCU_GENERIC=y. Therefore, in kernels built with
CONFIG_TASKS_RCU_GENERIC=y but CONFIG_TASKS_RCU=n, this gives
a "defined but not used" warning.
This commit therefore moves this variable under CONFIG_TASKS_RCU.
Link: https://lore.kernel.org/oe-kbuild-all/[email protected]/
Reported-by: kernel test robot <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
---
kernel/rcu/tasks.h | 2 ++
1 file changed, 2 insertions(+)
diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
index bfb5e1549f2b..185358c2f08d 100644
--- a/kernel/rcu/tasks.h
+++ b/kernel/rcu/tasks.h
@@ -136,8 +136,10 @@ static struct rcu_tasks rt_name = \
.kname = #rt_name, \
}
+#ifdef CONFIG_TASKS_RCU
/* Track exiting tasks in order to allow them to be waited for. */
DEFINE_STATIC_SRCU(tasks_rcu_exit_srcu);
+#endif
/* Avoid IPIing CPUs early in the grace period. */
#define RCU_TASK_IPI_DELAY (IS_ENABLED(CONFIG_TASKS_TRACE_RCU_READ_MB) ? HZ / 2 : 0)
--
2.40.0.rc2
Further shrinking the srcu_struct structure is eased by requiring
that in-module srcu_struct structures rely more heavily on static
initialization. In particular, this preserves the property that
a module-load-time srcu_struct initialization can fail only due
to memory-allocation failure of the per-CPU srcu_data structures.
It might also slightly improve robustness by keeping the number of memory
allocations that must succeed down percpu_alloc() call.
This is in preparation for splitting an srcu_usage structure out
of the srcu_struct structure.
[ paulmck: Fold in [email protected] feedback. ]
Cc: Christoph Hellwig <[email protected]>
Tested-by: Sachin Sant <[email protected]>
Tested-by: "Zhang, Qiang1" <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
---
include/linux/srcutree.h | 19 ++++++++++++++-----
kernel/rcu/srcutree.c | 19 +++++++++++++------
2 files changed, 27 insertions(+), 11 deletions(-)
diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
index 488d0e5d1ba3..3ce6deee1dbe 100644
--- a/include/linux/srcutree.h
+++ b/include/linux/srcutree.h
@@ -108,15 +108,24 @@ struct srcu_struct {
#define SRCU_STATE_SCAN1 1
#define SRCU_STATE_SCAN2 2
-#define __SRCU_STRUCT_INIT(name, pcpu_name) \
-{ \
- .sda = &pcpu_name, \
+#define __SRCU_STRUCT_INIT_COMMON(name) \
.lock = __SPIN_LOCK_UNLOCKED(name.lock), \
.srcu_gp_seq_needed = -1UL, \
.work = __DELAYED_WORK_INITIALIZER(name.work, NULL, 0), \
- __SRCU_DEP_MAP_INIT(name) \
+ __SRCU_DEP_MAP_INIT(name)
+
+#define __SRCU_STRUCT_INIT_MODULE(name) \
+{ \
+ __SRCU_STRUCT_INIT_COMMON(name) \
}
+#define __SRCU_STRUCT_INIT(name, pcpu_name) \
+{ \
+ .sda = &pcpu_name, \
+ __SRCU_STRUCT_INIT_COMMON(name) \
+}
+
+
/*
* Define and initialize a srcu struct at build time.
* Do -not- call init_srcu_struct() nor cleanup_srcu_struct() on it.
@@ -138,7 +147,7 @@ struct srcu_struct {
*/
#ifdef MODULE
# define __DEFINE_SRCU(name, is_static) \
- is_static struct srcu_struct name; \
+ is_static struct srcu_struct name = __SRCU_STRUCT_INIT_MODULE(name); \
extern struct srcu_struct * const __srcu_struct_##name; \
struct srcu_struct * const __srcu_struct_##name \
__section("___srcu_struct_ptrs") = &name
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index ab4ee58af84b..7e6e7dfb1a87 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -1873,13 +1873,14 @@ void __init srcu_init(void)
static int srcu_module_coming(struct module *mod)
{
int i;
+ struct srcu_struct *ssp;
struct srcu_struct **sspp = mod->srcu_struct_ptrs;
- int ret;
for (i = 0; i < mod->num_srcu_structs; i++) {
- ret = init_srcu_struct(*(sspp++));
- if (WARN_ON_ONCE(ret))
- return ret;
+ ssp = *(sspp++);
+ ssp->sda = alloc_percpu(struct srcu_data);
+ if (WARN_ON_ONCE(!ssp->sda))
+ return -ENOMEM;
}
return 0;
}
@@ -1888,10 +1889,16 @@ static int srcu_module_coming(struct module *mod)
static void srcu_module_going(struct module *mod)
{
int i;
+ struct srcu_struct *ssp;
struct srcu_struct **sspp = mod->srcu_struct_ptrs;
- for (i = 0; i < mod->num_srcu_structs; i++)
- cleanup_srcu_struct(*(sspp++));
+ for (i = 0; i < mod->num_srcu_structs; i++) {
+ ssp = *(sspp++);
+ if (!rcu_seq_state(smp_load_acquire(&ssp->srcu_gp_seq_needed)) &&
+ !WARN_ON_ONCE(!ssp->sda_is_static))
+ cleanup_srcu_struct(ssp);
+ free_percpu(ssp->sda);
+ }
}
/* Handle one module, either coming or going. */
--
2.40.0.rc2
This commit moves the ->srcu_cb_mutex field from the srcu_struct structure
to the srcu_usage structure to reduce the size of the former in order
to improve cache locality.
Suggested-by: Christoph Hellwig <[email protected]>
Tested-by: Sachin Sant <[email protected]>
Tested-by: "Zhang, Qiang1" <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
---
include/linux/srcutree.h | 2 +-
kernel/rcu/srcutree.c | 6 +++---
2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
index 443d27a214ef..231de66ceb15 100644
--- a/include/linux/srcutree.h
+++ b/include/linux/srcutree.h
@@ -65,13 +65,13 @@ struct srcu_usage {
struct srcu_node *level[RCU_NUM_LVLS + 1];
/* First node at each level. */
int srcu_size_state; /* Small-to-big transition state. */
+ struct mutex srcu_cb_mutex; /* Serialize CB preparation. */
};
/*
* Per-SRCU-domain structure, similar in function to rcu_state.
*/
struct srcu_struct {
- struct mutex srcu_cb_mutex; /* Serialize CB preparation. */
spinlock_t __private lock; /* Protect counters and size state. */
struct mutex srcu_gp_mutex; /* Serialize GP work. */
unsigned int srcu_idx; /* Current rdr array element. */
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 8428a184d506..1814f3bfc219 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -242,7 +242,7 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
return -ENOMEM;
ssp->srcu_sup->srcu_size_state = SRCU_SIZE_SMALL;
ssp->srcu_sup->node = NULL;
- mutex_init(&ssp->srcu_cb_mutex);
+ mutex_init(&ssp->srcu_sup->srcu_cb_mutex);
mutex_init(&ssp->srcu_gp_mutex);
ssp->srcu_idx = 0;
ssp->srcu_gp_seq = 0;
@@ -861,7 +861,7 @@ static void srcu_gp_end(struct srcu_struct *ssp)
int ss_state;
/* Prevent more than one additional grace period. */
- mutex_lock(&ssp->srcu_cb_mutex);
+ mutex_lock(&ssp->srcu_sup->srcu_cb_mutex);
/* End the current grace period. */
spin_lock_irq_rcu_node(ssp);
@@ -921,7 +921,7 @@ static void srcu_gp_end(struct srcu_struct *ssp)
}
/* Callback initiation done, allow grace periods after next. */
- mutex_unlock(&ssp->srcu_cb_mutex);
+ mutex_unlock(&ssp->srcu_sup->srcu_cb_mutex);
/* Start a new grace period if needed. */
spin_lock_irq_rcu_node(ssp);
--
2.40.0.rc2
The current srcu_struct structure is on the order of 200 bytes in size
(depending on architecture and .config), which is much better than the
old-style 26K bytes, but still all too inconvenient when one is trying
to achieve good cache locality on a fastpath involving SRCU readers.
However, only a few fields in srcu_struct are used by SRCU readers.
The remaining fields could be offloaded to a new srcu_update
structure, thus shrinking the srcu_struct structure down to a few
tens of bytes. This commit begins this noble quest, a quest that is
complicated by open-coded initialization of the srcu_struct within the
srcu_notifier_head structure. This complication is addressed by updating
the srcu_notifier_head structure's open coding, given that there does
not appear to be a straightforward way of abstracting that initialization.
This commit moves only the ->node pointer to srcu_update. Later commits
will move additional fields.
[ paulmck: Fold in [email protected]'s memory-leak fix. ]
Link: https://lore.kernel.org/all/[email protected]/
Suggested-by: Christoph Hellwig <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: "Michał Mirosław" <[email protected]>
Cc: Dmitry Osipenko <[email protected]>
Tested-by: Sachin Sant <[email protected]>
Tested-by: "Zhang, Qiang1" <[email protected]>
Acked-by: Rafael J. Wysocki <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
---
include/linux/notifier.h | 5 ++++-
include/linux/srcutiny.h | 6 +++---
include/linux/srcutree.h | 27 ++++++++++++++++++---------
kernel/rcu/rcu.h | 6 ++++--
kernel/rcu/srcutree.c | 28 +++++++++++++++++++---------
5 files changed, 48 insertions(+), 24 deletions(-)
diff --git a/include/linux/notifier.h b/include/linux/notifier.h
index aef88c2d1173..2aba75145144 100644
--- a/include/linux/notifier.h
+++ b/include/linux/notifier.h
@@ -73,6 +73,9 @@ struct raw_notifier_head {
struct srcu_notifier_head {
struct mutex mutex;
+#ifdef CONFIG_TREE_SRCU
+ struct srcu_usage srcuu;
+#endif
struct srcu_struct srcu;
struct notifier_block __rcu *head;
};
@@ -107,7 +110,7 @@ extern void srcu_init_notifier_head(struct srcu_notifier_head *nh);
{ \
.mutex = __MUTEX_INITIALIZER(name.mutex), \
.head = NULL, \
- .srcu = __SRCU_STRUCT_INIT(name.srcu, pcpu), \
+ .srcu = __SRCU_STRUCT_INIT(name.srcu, name.srcuu, pcpu), \
}
#define ATOMIC_NOTIFIER_HEAD(name) \
diff --git a/include/linux/srcutiny.h b/include/linux/srcutiny.h
index 5aa5e0faf6a1..ebd72491af99 100644
--- a/include/linux/srcutiny.h
+++ b/include/linux/srcutiny.h
@@ -31,7 +31,7 @@ struct srcu_struct {
void srcu_drive_gp(struct work_struct *wp);
-#define __SRCU_STRUCT_INIT(name, __ignored) \
+#define __SRCU_STRUCT_INIT(name, __ignored, ___ignored) \
{ \
.srcu_wq = __SWAIT_QUEUE_HEAD_INITIALIZER(name.srcu_wq), \
.srcu_cb_tail = &name.srcu_cb_head, \
@@ -44,9 +44,9 @@ void srcu_drive_gp(struct work_struct *wp);
* Tree SRCU, which needs some per-CPU data.
*/
#define DEFINE_SRCU(name) \
- struct srcu_struct name = __SRCU_STRUCT_INIT(name, name)
+ struct srcu_struct name = __SRCU_STRUCT_INIT(name, name, name)
#define DEFINE_STATIC_SRCU(name) \
- static struct srcu_struct name = __SRCU_STRUCT_INIT(name, name)
+ static struct srcu_struct name = __SRCU_STRUCT_INIT(name, name, name)
void synchronize_srcu(struct srcu_struct *ssp);
diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
index 3ce6deee1dbe..276f325f1296 100644
--- a/include/linux/srcutree.h
+++ b/include/linux/srcutree.h
@@ -57,11 +57,17 @@ struct srcu_node {
int grphi; /* Biggest CPU for node. */
};
+/*
+ * Per-SRCU-domain structure, update-side data linked from srcu_struct.
+ */
+struct srcu_usage {
+ struct srcu_node *node; /* Combining tree. */
+};
+
/*
* Per-SRCU-domain structure, similar in function to rcu_state.
*/
struct srcu_struct {
- struct srcu_node *node; /* Combining tree. */
struct srcu_node *level[RCU_NUM_LVLS + 1];
/* First node at each level. */
int srcu_size_state; /* Small-to-big transition state. */
@@ -90,6 +96,7 @@ struct srcu_struct {
unsigned long reschedule_count;
struct delayed_work work;
struct lockdep_map dep_map;
+ struct srcu_usage *srcu_sup; /* Update-side data. */
};
/* Values for size state variable (->srcu_size_state). */
@@ -108,24 +115,24 @@ struct srcu_struct {
#define SRCU_STATE_SCAN1 1
#define SRCU_STATE_SCAN2 2
-#define __SRCU_STRUCT_INIT_COMMON(name) \
+#define __SRCU_STRUCT_INIT_COMMON(name, usage_name) \
.lock = __SPIN_LOCK_UNLOCKED(name.lock), \
.srcu_gp_seq_needed = -1UL, \
.work = __DELAYED_WORK_INITIALIZER(name.work, NULL, 0), \
+ .srcu_sup = &usage_name, \
__SRCU_DEP_MAP_INIT(name)
-#define __SRCU_STRUCT_INIT_MODULE(name) \
+#define __SRCU_STRUCT_INIT_MODULE(name, usage_name) \
{ \
- __SRCU_STRUCT_INIT_COMMON(name) \
+ __SRCU_STRUCT_INIT_COMMON(name, usage_name) \
}
-#define __SRCU_STRUCT_INIT(name, pcpu_name) \
+#define __SRCU_STRUCT_INIT(name, usage_name, pcpu_name) \
{ \
.sda = &pcpu_name, \
- __SRCU_STRUCT_INIT_COMMON(name) \
+ __SRCU_STRUCT_INIT_COMMON(name, usage_name) \
}
-
/*
* Define and initialize a srcu struct at build time.
* Do -not- call init_srcu_struct() nor cleanup_srcu_struct() on it.
@@ -147,15 +154,17 @@ struct srcu_struct {
*/
#ifdef MODULE
# define __DEFINE_SRCU(name, is_static) \
- is_static struct srcu_struct name = __SRCU_STRUCT_INIT_MODULE(name); \
+ static struct srcu_usage name##_srcu_usage; \
+ is_static struct srcu_struct name = __SRCU_STRUCT_INIT_MODULE(name, name##_srcu_usage); \
extern struct srcu_struct * const __srcu_struct_##name; \
struct srcu_struct * const __srcu_struct_##name \
__section("___srcu_struct_ptrs") = &name
#else
# define __DEFINE_SRCU(name, is_static) \
static DEFINE_PER_CPU(struct srcu_data, name##_srcu_data); \
+ static struct srcu_usage name##_srcu_usage; \
is_static struct srcu_struct name = \
- __SRCU_STRUCT_INIT(name, name##_srcu_data)
+ __SRCU_STRUCT_INIT(name, name##_srcu_usage, name##_srcu_data)
#endif
#define DEFINE_SRCU(name) __DEFINE_SRCU(name, /* not static */)
#define DEFINE_STATIC_SRCU(name) __DEFINE_SRCU(name, static)
diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
index 115616ac3bfa..8d18d4bf0e29 100644
--- a/kernel/rcu/rcu.h
+++ b/kernel/rcu/rcu.h
@@ -341,11 +341,13 @@ extern void rcu_init_geometry(void);
* specified state structure (for SRCU) or the only rcu_state structure
* (for RCU).
*/
-#define srcu_for_each_node_breadth_first(sp, rnp) \
+#define _rcu_for_each_node_breadth_first(sp, rnp) \
for ((rnp) = &(sp)->node[0]; \
(rnp) < &(sp)->node[rcu_num_nodes]; (rnp)++)
#define rcu_for_each_node_breadth_first(rnp) \
- srcu_for_each_node_breadth_first(&rcu_state, rnp)
+ _rcu_for_each_node_breadth_first(&rcu_state, rnp)
+#define srcu_for_each_node_breadth_first(ssp, rnp) \
+ _rcu_for_each_node_breadth_first(ssp->srcu_sup, rnp)
/*
* Scan the leaves of the rcu_node hierarchy for the rcu_state structure.
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 7e6e7dfb1a87..049e20dbec76 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -173,12 +173,12 @@ static bool init_srcu_struct_nodes(struct srcu_struct *ssp, gfp_t gfp_flags)
/* Initialize geometry if it has not already been initialized. */
rcu_init_geometry();
- ssp->node = kcalloc(rcu_num_nodes, sizeof(*ssp->node), gfp_flags);
- if (!ssp->node)
+ ssp->srcu_sup->node = kcalloc(rcu_num_nodes, sizeof(*ssp->srcu_sup->node), gfp_flags);
+ if (!ssp->srcu_sup->node)
return false;
/* Work out the overall tree geometry. */
- ssp->level[0] = &ssp->node[0];
+ ssp->level[0] = &ssp->srcu_sup->node[0];
for (i = 1; i < rcu_num_lvls; i++)
ssp->level[i] = ssp->level[i - 1] + num_rcu_lvl[i - 1];
rcu_init_levelspread(levelspread, num_rcu_lvl);
@@ -195,7 +195,7 @@ static bool init_srcu_struct_nodes(struct srcu_struct *ssp, gfp_t gfp_flags)
snp->srcu_gp_seq_needed_exp = SRCU_SNP_INIT_SEQ;
snp->grplo = -1;
snp->grphi = -1;
- if (snp == &ssp->node[0]) {
+ if (snp == &ssp->srcu_sup->node[0]) {
/* Root node, special case. */
snp->srcu_parent = NULL;
continue;
@@ -236,8 +236,12 @@ static bool init_srcu_struct_nodes(struct srcu_struct *ssp, gfp_t gfp_flags)
*/
static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
{
+ if (!is_static)
+ ssp->srcu_sup = kzalloc(sizeof(*ssp->srcu_sup), GFP_KERNEL);
+ if (!ssp->srcu_sup)
+ return -ENOMEM;
ssp->srcu_size_state = SRCU_SIZE_SMALL;
- ssp->node = NULL;
+ ssp->srcu_sup->node = NULL;
mutex_init(&ssp->srcu_cb_mutex);
mutex_init(&ssp->srcu_gp_mutex);
ssp->srcu_idx = 0;
@@ -249,8 +253,11 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
ssp->sda_is_static = is_static;
if (!is_static)
ssp->sda = alloc_percpu(struct srcu_data);
- if (!ssp->sda)
+ if (!ssp->sda) {
+ if (!is_static)
+ kfree(ssp->srcu_sup);
return -ENOMEM;
+ }
init_srcu_struct_data(ssp);
ssp->srcu_gp_seq_needed_exp = 0;
ssp->srcu_last_gp_end = ktime_get_mono_fast_ns();
@@ -259,6 +266,7 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
if (!ssp->sda_is_static) {
free_percpu(ssp->sda);
ssp->sda = NULL;
+ kfree(ssp->srcu_sup);
return -ENOMEM;
}
} else {
@@ -656,13 +664,15 @@ void cleanup_srcu_struct(struct srcu_struct *ssp)
rcu_seq_current(&ssp->srcu_gp_seq), ssp->srcu_gp_seq_needed);
return; /* Caller forgot to stop doing call_srcu()? */
}
+ kfree(ssp->srcu_sup->node);
+ ssp->srcu_sup->node = NULL;
+ ssp->srcu_size_state = SRCU_SIZE_SMALL;
if (!ssp->sda_is_static) {
free_percpu(ssp->sda);
ssp->sda = NULL;
+ kfree(ssp->srcu_sup);
+ ssp->srcu_sup = NULL;
}
- kfree(ssp->node);
- ssp->node = NULL;
- ssp->srcu_size_state = SRCU_SIZE_SMALL;
}
EXPORT_SYMBOL_GPL(cleanup_srcu_struct);
--
2.40.0.rc2
This is a whitespace-only commit with no change in functionality.
Its purpose is to prepare for later commits that: (1) Cause statically
allocated srcu_struct structures to rely on compile-time initialization
and (2) Move fields from the srcu_struct structure to a new srcu_usage
structure.
Cc: Christoph Hellwig <[email protected]>
Tested-by: Sachin Sant <[email protected]>
Tested-by: "Zhang, Qiang1" <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
---
include/linux/srcutree.h | 28 ++++++++++++++--------------
1 file changed, 14 insertions(+), 14 deletions(-)
diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
index 558057b517b7..488d0e5d1ba3 100644
--- a/include/linux/srcutree.h
+++ b/include/linux/srcutree.h
@@ -108,13 +108,13 @@ struct srcu_struct {
#define SRCU_STATE_SCAN1 1
#define SRCU_STATE_SCAN2 2
-#define __SRCU_STRUCT_INIT(name, pcpu_name) \
-{ \
- .sda = &pcpu_name, \
- .lock = __SPIN_LOCK_UNLOCKED(name.lock), \
- .srcu_gp_seq_needed = -1UL, \
- .work = __DELAYED_WORK_INITIALIZER(name.work, NULL, 0), \
- __SRCU_DEP_MAP_INIT(name) \
+#define __SRCU_STRUCT_INIT(name, pcpu_name) \
+{ \
+ .sda = &pcpu_name, \
+ .lock = __SPIN_LOCK_UNLOCKED(name.lock), \
+ .srcu_gp_seq_needed = -1UL, \
+ .work = __DELAYED_WORK_INITIALIZER(name.work, NULL, 0), \
+ __SRCU_DEP_MAP_INIT(name) \
}
/*
@@ -137,15 +137,15 @@ struct srcu_struct {
* See include/linux/percpu-defs.h for the rules on per-CPU variables.
*/
#ifdef MODULE
-# define __DEFINE_SRCU(name, is_static) \
- is_static struct srcu_struct name; \
- extern struct srcu_struct * const __srcu_struct_##name; \
- struct srcu_struct * const __srcu_struct_##name \
+# define __DEFINE_SRCU(name, is_static) \
+ is_static struct srcu_struct name; \
+ extern struct srcu_struct * const __srcu_struct_##name; \
+ struct srcu_struct * const __srcu_struct_##name \
__section("___srcu_struct_ptrs") = &name
#else
-# define __DEFINE_SRCU(name, is_static) \
- static DEFINE_PER_CPU(struct srcu_data, name##_srcu_data); \
- is_static struct srcu_struct name = \
+# define __DEFINE_SRCU(name, is_static) \
+ static DEFINE_PER_CPU(struct srcu_data, name##_srcu_data); \
+ is_static struct srcu_struct name = \
__SRCU_STRUCT_INIT(name, name##_srcu_data)
#endif
#define DEFINE_SRCU(name) __DEFINE_SRCU(name, /* not static */)
--
2.40.0.rc2
This commit moves the ->level[] array from the srcu_struct structure to
the srcu_usage structure to reduce the size of the former in order to
improve cache locality.
Suggested-by: Christoph Hellwig <[email protected]>
Tested-by: Sachin Sant <[email protected]>
Tested-by: "Zhang, Qiang1" <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
---
include/linux/srcutree.h | 4 ++--
kernel/rcu/srcutree.c | 14 +++++++-------
2 files changed, 9 insertions(+), 9 deletions(-)
diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
index 276f325f1296..c7373fe5c14b 100644
--- a/include/linux/srcutree.h
+++ b/include/linux/srcutree.h
@@ -62,14 +62,14 @@ struct srcu_node {
*/
struct srcu_usage {
struct srcu_node *node; /* Combining tree. */
+ struct srcu_node *level[RCU_NUM_LVLS + 1];
+ /* First node at each level. */
};
/*
* Per-SRCU-domain structure, similar in function to rcu_state.
*/
struct srcu_struct {
- struct srcu_node *level[RCU_NUM_LVLS + 1];
- /* First node at each level. */
int srcu_size_state; /* Small-to-big transition state. */
struct mutex srcu_cb_mutex; /* Serialize CB preparation. */
spinlock_t __private lock; /* Protect counters and size state. */
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 049e20dbec76..acb0862faafa 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -178,9 +178,9 @@ static bool init_srcu_struct_nodes(struct srcu_struct *ssp, gfp_t gfp_flags)
return false;
/* Work out the overall tree geometry. */
- ssp->level[0] = &ssp->srcu_sup->node[0];
+ ssp->srcu_sup->level[0] = &ssp->srcu_sup->node[0];
for (i = 1; i < rcu_num_lvls; i++)
- ssp->level[i] = ssp->level[i - 1] + num_rcu_lvl[i - 1];
+ ssp->srcu_sup->level[i] = ssp->srcu_sup->level[i - 1] + num_rcu_lvl[i - 1];
rcu_init_levelspread(levelspread, num_rcu_lvl);
/* Each pass through this loop initializes one srcu_node structure. */
@@ -202,10 +202,10 @@ static bool init_srcu_struct_nodes(struct srcu_struct *ssp, gfp_t gfp_flags)
}
/* Non-root node. */
- if (snp == ssp->level[level + 1])
+ if (snp == ssp->srcu_sup->level[level + 1])
level++;
- snp->srcu_parent = ssp->level[level - 1] +
- (snp - ssp->level[level]) /
+ snp->srcu_parent = ssp->srcu_sup->level[level - 1] +
+ (snp - ssp->srcu_sup->level[level]) /
levelspread[level - 1];
}
@@ -214,7 +214,7 @@ static bool init_srcu_struct_nodes(struct srcu_struct *ssp, gfp_t gfp_flags)
* leaves of the srcu_node tree.
*/
level = rcu_num_lvls - 1;
- snp_first = ssp->level[level];
+ snp_first = ssp->srcu_sup->level[level];
for_each_possible_cpu(cpu) {
sdp = per_cpu_ptr(ssp->sda, cpu);
sdp->mynode = &snp_first[cpu / levelspread[level]];
@@ -889,7 +889,7 @@ static void srcu_gp_end(struct srcu_struct *ssp)
srcu_for_each_node_breadth_first(ssp, snp) {
spin_lock_irq_rcu_node(snp);
cbs = false;
- last_lvl = snp >= ssp->level[rcu_num_lvls - 1];
+ last_lvl = snp >= ssp->srcu_sup->level[rcu_num_lvls - 1];
if (last_lvl)
cbs = ss_state < SRCU_SIZE_BIG || snp->srcu_have_cbs[idx] == gpseq;
snp->srcu_have_cbs[idx] = gpseq;
--
2.40.0.rc2
This commit moves the ->srcu_size_jiffies, ->srcu_n_lock_retries,
and ->srcu_n_exp_nodelay fields from the srcu_struct structure to the
srcu_usage structure to reduce the size of the former in order to improve
cache locality.
Suggested-by: Christoph Hellwig <[email protected]>
Tested-by: Sachin Sant <[email protected]>
Tested-by: "Zhang, Qiang1" <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
---
include/linux/srcutree.h | 6 +++---
kernel/rcu/srcutree.c | 18 +++++++++---------
2 files changed, 12 insertions(+), 12 deletions(-)
diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
index 372e35b0e8b6..3023492d8d89 100644
--- a/include/linux/srcutree.h
+++ b/include/linux/srcutree.h
@@ -73,6 +73,9 @@ struct srcu_usage {
unsigned long srcu_gp_seq_needed_exp; /* Furthest future exp GP. */
unsigned long srcu_gp_start; /* Last GP start timestamp (jiffies) */
unsigned long srcu_last_gp_end; /* Last GP end timestamp (ns) */
+ unsigned long srcu_size_jiffies; /* Current contention-measurement interval. */
+ unsigned long srcu_n_lock_retries; /* Contention events in current interval. */
+ unsigned long srcu_n_exp_nodelay; /* # expedited no-delays in current GP phase. */
};
/*
@@ -80,9 +83,6 @@ struct srcu_usage {
*/
struct srcu_struct {
unsigned int srcu_idx; /* Current rdr array element. */
- unsigned long srcu_size_jiffies; /* Current contention-measurement interval. */
- unsigned long srcu_n_lock_retries; /* Contention events in current interval. */
- unsigned long srcu_n_exp_nodelay; /* # expedited no-delays in current GP phase. */
struct srcu_data __percpu *sda; /* Per-CPU srcu_data array. */
bool sda_is_static; /* May ->sda be passed to free_percpu()? */
unsigned long srcu_barrier_seq; /* srcu_barrier seq #. */
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 340eb685cf64..291fb520bce0 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -348,11 +348,11 @@ static void spin_lock_irqsave_check_contention(struct srcu_struct *ssp)
if (!SRCU_SIZING_IS_CONTEND() || ssp->srcu_sup->srcu_size_state)
return;
j = jiffies;
- if (ssp->srcu_size_jiffies != j) {
- ssp->srcu_size_jiffies = j;
- ssp->srcu_n_lock_retries = 0;
+ if (ssp->srcu_sup->srcu_size_jiffies != j) {
+ ssp->srcu_sup->srcu_size_jiffies = j;
+ ssp->srcu_sup->srcu_n_lock_retries = 0;
}
- if (++ssp->srcu_n_lock_retries <= small_contention_lim)
+ if (++ssp->srcu_sup->srcu_n_lock_retries <= small_contention_lim)
return;
__srcu_transition_to_big(ssp);
}
@@ -624,8 +624,8 @@ static unsigned long srcu_get_delay(struct srcu_struct *ssp)
if (time_after(j, gpstart))
jbase += j - gpstart;
if (!jbase) {
- WRITE_ONCE(ssp->srcu_n_exp_nodelay, READ_ONCE(ssp->srcu_n_exp_nodelay) + 1);
- if (READ_ONCE(ssp->srcu_n_exp_nodelay) > srcu_max_nodelay_phase)
+ WRITE_ONCE(ssp->srcu_sup->srcu_n_exp_nodelay, READ_ONCE(ssp->srcu_sup->srcu_n_exp_nodelay) + 1);
+ if (READ_ONCE(ssp->srcu_sup->srcu_n_exp_nodelay) > srcu_max_nodelay_phase)
jbase = 1;
}
}
@@ -783,7 +783,7 @@ static void srcu_gp_start(struct srcu_struct *ssp)
rcu_seq_snap(&ssp->srcu_sup->srcu_gp_seq));
spin_unlock_rcu_node(sdp); /* Interrupts remain disabled. */
WRITE_ONCE(ssp->srcu_sup->srcu_gp_start, jiffies);
- WRITE_ONCE(ssp->srcu_n_exp_nodelay, 0);
+ WRITE_ONCE(ssp->srcu_sup->srcu_n_exp_nodelay, 0);
smp_mb(); /* Order prior store to ->srcu_gp_seq_needed vs. GP start. */
rcu_seq_start(&ssp->srcu_sup->srcu_gp_seq);
state = rcu_seq_state(ssp->srcu_sup->srcu_gp_seq);
@@ -1625,7 +1625,7 @@ static void srcu_advance_state(struct srcu_struct *ssp)
srcu_flip(ssp);
spin_lock_irq_rcu_node(ssp->srcu_sup);
rcu_seq_set_state(&ssp->srcu_sup->srcu_gp_seq, SRCU_STATE_SCAN2);
- ssp->srcu_n_exp_nodelay = 0;
+ ssp->srcu_sup->srcu_n_exp_nodelay = 0;
spin_unlock_irq_rcu_node(ssp->srcu_sup);
}
@@ -1640,7 +1640,7 @@ static void srcu_advance_state(struct srcu_struct *ssp)
mutex_unlock(&ssp->srcu_sup->srcu_gp_mutex);
return; /* readers present, retry later. */
}
- ssp->srcu_n_exp_nodelay = 0;
+ ssp->srcu_sup->srcu_n_exp_nodelay = 0;
srcu_gp_end(ssp); /* Releases ->srcu_gp_mutex. */
}
}
--
2.40.0.rc2
Currently, both __init_srcu_struct() in CONFIG_DEBUG_LOCK_ALLOC=y kernels
and init_srcu_struct() in CONFIG_DEBUG_LOCK_ALLOC=n kernel initialize
the srcu_struct structure's ->lock before the srcu_usage structure has
been allocated. This of course prevents the ->lock from being moved
to the srcu_usage structure, so this commit moves the initialization
into the init_srcu_struct_fields() after the srcu_usage structure has
been allocated.
Cc: Christoph Hellwig <[email protected]>
Tested-by: Sachin Sant <[email protected]>
Tested-by: "Zhang, Qiang1" <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
---
kernel/rcu/srcutree.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 1814f3bfc219..c2a024a60f1a 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -240,6 +240,8 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
ssp->srcu_sup = kzalloc(sizeof(*ssp->srcu_sup), GFP_KERNEL);
if (!ssp->srcu_sup)
return -ENOMEM;
+ if (!is_static)
+ spin_lock_init(&ACCESS_PRIVATE(ssp, lock));
ssp->srcu_sup->srcu_size_state = SRCU_SIZE_SMALL;
ssp->srcu_sup->node = NULL;
mutex_init(&ssp->srcu_sup->srcu_cb_mutex);
@@ -285,7 +287,6 @@ int __init_srcu_struct(struct srcu_struct *ssp, const char *name,
/* Don't re-initialize a lock while it is held. */
debug_check_no_locks_freed((void *)ssp, sizeof(*ssp));
lockdep_init_map(&ssp->dep_map, name, key, 0);
- spin_lock_init(&ACCESS_PRIVATE(ssp, lock));
return init_srcu_struct_fields(ssp, false);
}
EXPORT_SYMBOL_GPL(__init_srcu_struct);
@@ -302,7 +303,6 @@ EXPORT_SYMBOL_GPL(__init_srcu_struct);
*/
int init_srcu_struct(struct srcu_struct *ssp)
{
- spin_lock_init(&ACCESS_PRIVATE(ssp, lock));
return init_srcu_struct_fields(ssp, false);
}
EXPORT_SYMBOL_GPL(init_srcu_struct);
--
2.40.0.rc2
This commit moves the ->srcu_size_state field from the srcu_struct
structure to the srcu_usage structure to reduce the size of the former
in order to improve cache locality.
Suggested-by: Christoph Hellwig <[email protected]>
Tested-by: Sachin Sant <[email protected]>
Tested-by: "Zhang, Qiang1" <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
---
include/linux/srcutree.h | 2 +-
kernel/rcu/srcutree.c | 37 +++++++++++++++++++------------------
2 files changed, 20 insertions(+), 19 deletions(-)
diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
index c7373fe5c14b..443d27a214ef 100644
--- a/include/linux/srcutree.h
+++ b/include/linux/srcutree.h
@@ -64,13 +64,13 @@ struct srcu_usage {
struct srcu_node *node; /* Combining tree. */
struct srcu_node *level[RCU_NUM_LVLS + 1];
/* First node at each level. */
+ int srcu_size_state; /* Small-to-big transition state. */
};
/*
* Per-SRCU-domain structure, similar in function to rcu_state.
*/
struct srcu_struct {
- int srcu_size_state; /* Small-to-big transition state. */
struct mutex srcu_cb_mutex; /* Serialize CB preparation. */
spinlock_t __private lock; /* Protect counters and size state. */
struct mutex srcu_gp_mutex; /* Serialize GP work. */
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index acb0862faafa..8428a184d506 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -225,7 +225,7 @@ static bool init_srcu_struct_nodes(struct srcu_struct *ssp, gfp_t gfp_flags)
}
sdp->grpmask = 1 << (cpu - sdp->mynode->grplo);
}
- smp_store_release(&ssp->srcu_size_state, SRCU_SIZE_WAIT_BARRIER);
+ smp_store_release(&ssp->srcu_sup->srcu_size_state, SRCU_SIZE_WAIT_BARRIER);
return true;
}
@@ -240,7 +240,7 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
ssp->srcu_sup = kzalloc(sizeof(*ssp->srcu_sup), GFP_KERNEL);
if (!ssp->srcu_sup)
return -ENOMEM;
- ssp->srcu_size_state = SRCU_SIZE_SMALL;
+ ssp->srcu_sup->srcu_size_state = SRCU_SIZE_SMALL;
ssp->srcu_sup->node = NULL;
mutex_init(&ssp->srcu_cb_mutex);
mutex_init(&ssp->srcu_gp_mutex);
@@ -261,7 +261,7 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
init_srcu_struct_data(ssp);
ssp->srcu_gp_seq_needed_exp = 0;
ssp->srcu_last_gp_end = ktime_get_mono_fast_ns();
- if (READ_ONCE(ssp->srcu_size_state) == SRCU_SIZE_SMALL && SRCU_SIZING_IS_INIT()) {
+ if (READ_ONCE(ssp->srcu_sup->srcu_size_state) == SRCU_SIZE_SMALL && SRCU_SIZING_IS_INIT()) {
if (!init_srcu_struct_nodes(ssp, GFP_ATOMIC)) {
if (!ssp->sda_is_static) {
free_percpu(ssp->sda);
@@ -270,7 +270,7 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
return -ENOMEM;
}
} else {
- WRITE_ONCE(ssp->srcu_size_state, SRCU_SIZE_BIG);
+ WRITE_ONCE(ssp->srcu_sup->srcu_size_state, SRCU_SIZE_BIG);
}
}
smp_store_release(&ssp->srcu_gp_seq_needed, 0); /* Init done. */
@@ -315,7 +315,7 @@ EXPORT_SYMBOL_GPL(init_srcu_struct);
static void __srcu_transition_to_big(struct srcu_struct *ssp)
{
lockdep_assert_held(&ACCESS_PRIVATE(ssp, lock));
- smp_store_release(&ssp->srcu_size_state, SRCU_SIZE_ALLOC);
+ smp_store_release(&ssp->srcu_sup->srcu_size_state, SRCU_SIZE_ALLOC);
}
/*
@@ -326,10 +326,10 @@ static void srcu_transition_to_big(struct srcu_struct *ssp)
unsigned long flags;
/* Double-checked locking on ->srcu_size-state. */
- if (smp_load_acquire(&ssp->srcu_size_state) != SRCU_SIZE_SMALL)
+ if (smp_load_acquire(&ssp->srcu_sup->srcu_size_state) != SRCU_SIZE_SMALL)
return;
spin_lock_irqsave_rcu_node(ssp, flags);
- if (smp_load_acquire(&ssp->srcu_size_state) != SRCU_SIZE_SMALL) {
+ if (smp_load_acquire(&ssp->srcu_sup->srcu_size_state) != SRCU_SIZE_SMALL) {
spin_unlock_irqrestore_rcu_node(ssp, flags);
return;
}
@@ -345,7 +345,7 @@ static void spin_lock_irqsave_check_contention(struct srcu_struct *ssp)
{
unsigned long j;
- if (!SRCU_SIZING_IS_CONTEND() || ssp->srcu_size_state)
+ if (!SRCU_SIZING_IS_CONTEND() || ssp->srcu_sup->srcu_size_state)
return;
j = jiffies;
if (ssp->srcu_size_jiffies != j) {
@@ -666,7 +666,7 @@ void cleanup_srcu_struct(struct srcu_struct *ssp)
}
kfree(ssp->srcu_sup->node);
ssp->srcu_sup->node = NULL;
- ssp->srcu_size_state = SRCU_SIZE_SMALL;
+ ssp->srcu_sup->srcu_size_state = SRCU_SIZE_SMALL;
if (!ssp->sda_is_static) {
free_percpu(ssp->sda);
ssp->sda = NULL;
@@ -770,7 +770,7 @@ static void srcu_gp_start(struct srcu_struct *ssp)
struct srcu_data *sdp;
int state;
- if (smp_load_acquire(&ssp->srcu_size_state) < SRCU_SIZE_WAIT_BARRIER)
+ if (smp_load_acquire(&ssp->srcu_sup->srcu_size_state) < SRCU_SIZE_WAIT_BARRIER)
sdp = per_cpu_ptr(ssp->sda, get_boot_cpu_id());
else
sdp = this_cpu_ptr(ssp->sda);
@@ -880,7 +880,7 @@ static void srcu_gp_end(struct srcu_struct *ssp)
/* A new grace period can start at this point. But only one. */
/* Initiate callback invocation as needed. */
- ss_state = smp_load_acquire(&ssp->srcu_size_state);
+ ss_state = smp_load_acquire(&ssp->srcu_sup->srcu_size_state);
if (ss_state < SRCU_SIZE_WAIT_BARRIER) {
srcu_schedule_cbs_sdp(per_cpu_ptr(ssp->sda, get_boot_cpu_id()),
cbdelay);
@@ -940,7 +940,7 @@ static void srcu_gp_end(struct srcu_struct *ssp)
if (ss_state == SRCU_SIZE_ALLOC)
init_srcu_struct_nodes(ssp, GFP_KERNEL);
else
- smp_store_release(&ssp->srcu_size_state, ss_state + 1);
+ smp_store_release(&ssp->srcu_sup->srcu_size_state, ss_state + 1);
}
}
@@ -1002,7 +1002,7 @@ static void srcu_funnel_gp_start(struct srcu_struct *ssp, struct srcu_data *sdp,
unsigned long snp_seq;
/* Ensure that snp node tree is fully initialized before traversing it */
- if (smp_load_acquire(&ssp->srcu_size_state) < SRCU_SIZE_WAIT_BARRIER)
+ if (smp_load_acquire(&ssp->srcu_sup->srcu_size_state) < SRCU_SIZE_WAIT_BARRIER)
snp_leaf = NULL;
else
snp_leaf = sdp->mynode;
@@ -1209,7 +1209,7 @@ static unsigned long srcu_gp_start_if_needed(struct srcu_struct *ssp,
* sequence number cannot wrap around in the meantime.
*/
idx = __srcu_read_lock_nmisafe(ssp);
- ss_state = smp_load_acquire(&ssp->srcu_size_state);
+ ss_state = smp_load_acquire(&ssp->srcu_sup->srcu_size_state);
if (ss_state < SRCU_SIZE_WAIT_CALL)
sdp = per_cpu_ptr(ssp->sda, get_boot_cpu_id());
else
@@ -1546,7 +1546,7 @@ void srcu_barrier(struct srcu_struct *ssp)
atomic_set(&ssp->srcu_barrier_cpu_cnt, 1);
idx = __srcu_read_lock_nmisafe(ssp);
- if (smp_load_acquire(&ssp->srcu_size_state) < SRCU_SIZE_WAIT_BARRIER)
+ if (smp_load_acquire(&ssp->srcu_sup->srcu_size_state) < SRCU_SIZE_WAIT_BARRIER)
srcu_barrier_one_cpu(ssp, per_cpu_ptr(ssp->sda, get_boot_cpu_id()));
else
for_each_possible_cpu(cpu)
@@ -1784,7 +1784,7 @@ void srcu_torture_stats_print(struct srcu_struct *ssp, char *tt, char *tf)
int cpu;
int idx;
unsigned long s0 = 0, s1 = 0;
- int ss_state = READ_ONCE(ssp->srcu_size_state);
+ int ss_state = READ_ONCE(ssp->srcu_sup->srcu_size_state);
int ss_state_idx = ss_state;
idx = ssp->srcu_idx & 0x1;
@@ -1871,8 +1871,9 @@ void __init srcu_init(void)
ssp = list_first_entry(&srcu_boot_list, struct srcu_struct,
work.work.entry);
list_del_init(&ssp->work.work.entry);
- if (SRCU_SIZING_IS(SRCU_SIZING_INIT) && ssp->srcu_size_state == SRCU_SIZE_SMALL)
- ssp->srcu_size_state = SRCU_SIZE_ALLOC;
+ if (SRCU_SIZING_IS(SRCU_SIZING_INIT) &&
+ ssp->srcu_sup->srcu_size_state == SRCU_SIZE_SMALL)
+ ssp->srcu_sup->srcu_size_state = SRCU_SIZE_ALLOC;
queue_work(rcu_gp_wq, &ssp->work.work);
}
}
--
2.40.0.rc2
This commit moves the ->srcu_gp_mutex field from the srcu_struct structure
to the srcu_usage structure to reduce the size of the former in order
to improve cache locality.
Suggested-by: Christoph Hellwig <[email protected]>
Tested-by: Sachin Sant <[email protected]>
Tested-by: "Zhang, Qiang1" <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
---
include/linux/srcutree.h | 2 +-
kernel/rcu/srcutree.c | 14 +++++++-------
2 files changed, 8 insertions(+), 8 deletions(-)
diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
index 694d87b81917..d04e3da6181c 100644
--- a/include/linux/srcutree.h
+++ b/include/linux/srcutree.h
@@ -67,13 +67,13 @@ struct srcu_usage {
int srcu_size_state; /* Small-to-big transition state. */
struct mutex srcu_cb_mutex; /* Serialize CB preparation. */
spinlock_t __private lock; /* Protect counters and size state. */
+ struct mutex srcu_gp_mutex; /* Serialize GP work. */
};
/*
* Per-SRCU-domain structure, similar in function to rcu_state.
*/
struct srcu_struct {
- struct mutex srcu_gp_mutex; /* Serialize GP work. */
unsigned int srcu_idx; /* Current rdr array element. */
unsigned long srcu_gp_seq; /* Grace-period seq #. */
unsigned long srcu_gp_seq_needed; /* Latest gp_seq needed. */
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index c42248cf18f6..a36066798de7 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -245,7 +245,7 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
ssp->srcu_sup->srcu_size_state = SRCU_SIZE_SMALL;
ssp->srcu_sup->node = NULL;
mutex_init(&ssp->srcu_sup->srcu_cb_mutex);
- mutex_init(&ssp->srcu_gp_mutex);
+ mutex_init(&ssp->srcu_sup->srcu_gp_mutex);
ssp->srcu_idx = 0;
ssp->srcu_gp_seq = 0;
ssp->srcu_barrier_seq = 0;
@@ -876,7 +876,7 @@ static void srcu_gp_end(struct srcu_struct *ssp)
if (ULONG_CMP_LT(ssp->srcu_gp_seq_needed_exp, gpseq))
WRITE_ONCE(ssp->srcu_gp_seq_needed_exp, gpseq);
spin_unlock_irq_rcu_node(ssp->srcu_sup);
- mutex_unlock(&ssp->srcu_gp_mutex);
+ mutex_unlock(&ssp->srcu_sup->srcu_gp_mutex);
/* A new grace period can start at this point. But only one. */
/* Initiate callback invocation as needed. */
@@ -1585,7 +1585,7 @@ static void srcu_advance_state(struct srcu_struct *ssp)
{
int idx;
- mutex_lock(&ssp->srcu_gp_mutex);
+ mutex_lock(&ssp->srcu_sup->srcu_gp_mutex);
/*
* Because readers might be delayed for an extended period after
@@ -1603,7 +1603,7 @@ static void srcu_advance_state(struct srcu_struct *ssp)
if (ULONG_CMP_GE(ssp->srcu_gp_seq, ssp->srcu_gp_seq_needed)) {
WARN_ON_ONCE(rcu_seq_state(ssp->srcu_gp_seq));
spin_unlock_irq_rcu_node(ssp->srcu_sup);
- mutex_unlock(&ssp->srcu_gp_mutex);
+ mutex_unlock(&ssp->srcu_sup->srcu_gp_mutex);
return;
}
idx = rcu_seq_state(READ_ONCE(ssp->srcu_gp_seq));
@@ -1611,7 +1611,7 @@ static void srcu_advance_state(struct srcu_struct *ssp)
srcu_gp_start(ssp);
spin_unlock_irq_rcu_node(ssp->srcu_sup);
if (idx != SRCU_STATE_IDLE) {
- mutex_unlock(&ssp->srcu_gp_mutex);
+ mutex_unlock(&ssp->srcu_sup->srcu_gp_mutex);
return; /* Someone else started the grace period. */
}
}
@@ -1619,7 +1619,7 @@ static void srcu_advance_state(struct srcu_struct *ssp)
if (rcu_seq_state(READ_ONCE(ssp->srcu_gp_seq)) == SRCU_STATE_SCAN1) {
idx = 1 ^ (ssp->srcu_idx & 1);
if (!try_check_zero(ssp, idx, 1)) {
- mutex_unlock(&ssp->srcu_gp_mutex);
+ mutex_unlock(&ssp->srcu_sup->srcu_gp_mutex);
return; /* readers present, retry later. */
}
srcu_flip(ssp);
@@ -1637,7 +1637,7 @@ static void srcu_advance_state(struct srcu_struct *ssp)
*/
idx = 1 ^ (ssp->srcu_idx & 1);
if (!try_check_zero(ssp, idx, 2)) {
- mutex_unlock(&ssp->srcu_gp_mutex);
+ mutex_unlock(&ssp->srcu_sup->srcu_gp_mutex);
return; /* readers present, retry later. */
}
ssp->srcu_n_exp_nodelay = 0;
--
2.40.0.rc2
This commit moves the ->lock field from the srcu_struct structure to
the srcu_usage structure to reduce the size of the former in order to
improve cache locality.
Suggested-by: Christoph Hellwig <[email protected]>
Tested-by: Sachin Sant <[email protected]>
Tested-by: "Zhang, Qiang1" <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
---
include/linux/srcutree.h | 11 +++++---
kernel/rcu/srcutree.c | 56 ++++++++++++++++++++--------------------
2 files changed, 35 insertions(+), 32 deletions(-)
diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
index 231de66ceb15..694d87b81917 100644
--- a/include/linux/srcutree.h
+++ b/include/linux/srcutree.h
@@ -66,13 +66,13 @@ struct srcu_usage {
/* First node at each level. */
int srcu_size_state; /* Small-to-big transition state. */
struct mutex srcu_cb_mutex; /* Serialize CB preparation. */
+ spinlock_t __private lock; /* Protect counters and size state. */
};
/*
* Per-SRCU-domain structure, similar in function to rcu_state.
*/
struct srcu_struct {
- spinlock_t __private lock; /* Protect counters and size state. */
struct mutex srcu_gp_mutex; /* Serialize GP work. */
unsigned int srcu_idx; /* Current rdr array element. */
unsigned long srcu_gp_seq; /* Grace-period seq #. */
@@ -116,7 +116,6 @@ struct srcu_struct {
#define SRCU_STATE_SCAN2 2
#define __SRCU_STRUCT_INIT_COMMON(name, usage_name) \
- .lock = __SPIN_LOCK_UNLOCKED(name.lock), \
.srcu_gp_seq_needed = -1UL, \
.work = __DELAYED_WORK_INITIALIZER(name.work, NULL, 0), \
.srcu_sup = &usage_name, \
@@ -154,7 +153,9 @@ struct srcu_struct {
*/
#ifdef MODULE
# define __DEFINE_SRCU(name, is_static) \
- static struct srcu_usage name##_srcu_usage; \
+ static struct srcu_usage name##_srcu_usage = { \
+ .lock = __SPIN_LOCK_UNLOCKED(name.lock), \
+ }; \
is_static struct srcu_struct name = __SRCU_STRUCT_INIT_MODULE(name, name##_srcu_usage); \
extern struct srcu_struct * const __srcu_struct_##name; \
struct srcu_struct * const __srcu_struct_##name \
@@ -162,7 +163,9 @@ struct srcu_struct {
#else
# define __DEFINE_SRCU(name, is_static) \
static DEFINE_PER_CPU(struct srcu_data, name##_srcu_data); \
- static struct srcu_usage name##_srcu_usage; \
+ static struct srcu_usage name##_srcu_usage = { \
+ .lock = __SPIN_LOCK_UNLOCKED(name.lock), \
+ }; \
is_static struct srcu_struct name = \
__SRCU_STRUCT_INIT(name, name##_srcu_usage, name##_srcu_data)
#endif
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index c2a024a60f1a..c42248cf18f6 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -103,7 +103,7 @@ do { \
#define spin_trylock_irqsave_rcu_node(p, flags) \
({ \
- bool ___locked = spin_trylock_irqsave(&ACCESS_PRIVATE(p, lock), flags); \
+ bool ___locked = spin_trylock_irqsave(&ACCESS_PRIVATE(p, lock), flags); \
\
if (___locked) \
smp_mb__after_unlock_lock(); \
@@ -241,7 +241,7 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
if (!ssp->srcu_sup)
return -ENOMEM;
if (!is_static)
- spin_lock_init(&ACCESS_PRIVATE(ssp, lock));
+ spin_lock_init(&ACCESS_PRIVATE(ssp->srcu_sup, lock));
ssp->srcu_sup->srcu_size_state = SRCU_SIZE_SMALL;
ssp->srcu_sup->node = NULL;
mutex_init(&ssp->srcu_sup->srcu_cb_mutex);
@@ -314,7 +314,7 @@ EXPORT_SYMBOL_GPL(init_srcu_struct);
*/
static void __srcu_transition_to_big(struct srcu_struct *ssp)
{
- lockdep_assert_held(&ACCESS_PRIVATE(ssp, lock));
+ lockdep_assert_held(&ACCESS_PRIVATE(ssp->srcu_sup, lock));
smp_store_release(&ssp->srcu_sup->srcu_size_state, SRCU_SIZE_ALLOC);
}
@@ -328,13 +328,13 @@ static void srcu_transition_to_big(struct srcu_struct *ssp)
/* Double-checked locking on ->srcu_size-state. */
if (smp_load_acquire(&ssp->srcu_sup->srcu_size_state) != SRCU_SIZE_SMALL)
return;
- spin_lock_irqsave_rcu_node(ssp, flags);
+ spin_lock_irqsave_rcu_node(ssp->srcu_sup, flags);
if (smp_load_acquire(&ssp->srcu_sup->srcu_size_state) != SRCU_SIZE_SMALL) {
- spin_unlock_irqrestore_rcu_node(ssp, flags);
+ spin_unlock_irqrestore_rcu_node(ssp->srcu_sup, flags);
return;
}
__srcu_transition_to_big(ssp);
- spin_unlock_irqrestore_rcu_node(ssp, flags);
+ spin_unlock_irqrestore_rcu_node(ssp->srcu_sup, flags);
}
/*
@@ -369,9 +369,9 @@ static void spin_lock_irqsave_sdp_contention(struct srcu_data *sdp, unsigned lon
if (spin_trylock_irqsave_rcu_node(sdp, *flags))
return;
- spin_lock_irqsave_rcu_node(ssp, *flags);
+ spin_lock_irqsave_rcu_node(ssp->srcu_sup, *flags);
spin_lock_irqsave_check_contention(ssp);
- spin_unlock_irqrestore_rcu_node(ssp, *flags);
+ spin_unlock_irqrestore_rcu_node(ssp->srcu_sup, *flags);
spin_lock_irqsave_rcu_node(sdp, *flags);
}
@@ -383,9 +383,9 @@ static void spin_lock_irqsave_sdp_contention(struct srcu_data *sdp, unsigned lon
*/
static void spin_lock_irqsave_ssp_contention(struct srcu_struct *ssp, unsigned long *flags)
{
- if (spin_trylock_irqsave_rcu_node(ssp, *flags))
+ if (spin_trylock_irqsave_rcu_node(ssp->srcu_sup, *flags))
return;
- spin_lock_irqsave_rcu_node(ssp, *flags);
+ spin_lock_irqsave_rcu_node(ssp->srcu_sup, *flags);
spin_lock_irqsave_check_contention(ssp);
}
@@ -404,13 +404,13 @@ static void check_init_srcu_struct(struct srcu_struct *ssp)
/* The smp_load_acquire() pairs with the smp_store_release(). */
if (!rcu_seq_state(smp_load_acquire(&ssp->srcu_gp_seq_needed))) /*^^^*/
return; /* Already initialized. */
- spin_lock_irqsave_rcu_node(ssp, flags);
+ spin_lock_irqsave_rcu_node(ssp->srcu_sup, flags);
if (!rcu_seq_state(ssp->srcu_gp_seq_needed)) {
- spin_unlock_irqrestore_rcu_node(ssp, flags);
+ spin_unlock_irqrestore_rcu_node(ssp->srcu_sup, flags);
return;
}
init_srcu_struct_fields(ssp, true);
- spin_unlock_irqrestore_rcu_node(ssp, flags);
+ spin_unlock_irqrestore_rcu_node(ssp->srcu_sup, flags);
}
/*
@@ -774,7 +774,7 @@ static void srcu_gp_start(struct srcu_struct *ssp)
sdp = per_cpu_ptr(ssp->sda, get_boot_cpu_id());
else
sdp = this_cpu_ptr(ssp->sda);
- lockdep_assert_held(&ACCESS_PRIVATE(ssp, lock));
+ lockdep_assert_held(&ACCESS_PRIVATE(ssp->srcu_sup, lock));
WARN_ON_ONCE(ULONG_CMP_GE(ssp->srcu_gp_seq, ssp->srcu_gp_seq_needed));
spin_lock_rcu_node(sdp); /* Interrupts already disabled. */
rcu_segcblist_advance(&sdp->srcu_cblist,
@@ -864,7 +864,7 @@ static void srcu_gp_end(struct srcu_struct *ssp)
mutex_lock(&ssp->srcu_sup->srcu_cb_mutex);
/* End the current grace period. */
- spin_lock_irq_rcu_node(ssp);
+ spin_lock_irq_rcu_node(ssp->srcu_sup);
idx = rcu_seq_state(ssp->srcu_gp_seq);
WARN_ON_ONCE(idx != SRCU_STATE_SCAN2);
if (ULONG_CMP_LT(READ_ONCE(ssp->srcu_gp_seq), READ_ONCE(ssp->srcu_gp_seq_needed_exp)))
@@ -875,7 +875,7 @@ static void srcu_gp_end(struct srcu_struct *ssp)
gpseq = rcu_seq_current(&ssp->srcu_gp_seq);
if (ULONG_CMP_LT(ssp->srcu_gp_seq_needed_exp, gpseq))
WRITE_ONCE(ssp->srcu_gp_seq_needed_exp, gpseq);
- spin_unlock_irq_rcu_node(ssp);
+ spin_unlock_irq_rcu_node(ssp->srcu_sup);
mutex_unlock(&ssp->srcu_gp_mutex);
/* A new grace period can start at this point. But only one. */
@@ -924,15 +924,15 @@ static void srcu_gp_end(struct srcu_struct *ssp)
mutex_unlock(&ssp->srcu_sup->srcu_cb_mutex);
/* Start a new grace period if needed. */
- spin_lock_irq_rcu_node(ssp);
+ spin_lock_irq_rcu_node(ssp->srcu_sup);
gpseq = rcu_seq_current(&ssp->srcu_gp_seq);
if (!rcu_seq_state(gpseq) &&
ULONG_CMP_LT(gpseq, ssp->srcu_gp_seq_needed)) {
srcu_gp_start(ssp);
- spin_unlock_irq_rcu_node(ssp);
+ spin_unlock_irq_rcu_node(ssp->srcu_sup);
srcu_reschedule(ssp, 0);
} else {
- spin_unlock_irq_rcu_node(ssp);
+ spin_unlock_irq_rcu_node(ssp->srcu_sup);
}
/* Transition to big if needed. */
@@ -975,7 +975,7 @@ static void srcu_funnel_exp_start(struct srcu_struct *ssp, struct srcu_node *snp
spin_lock_irqsave_ssp_contention(ssp, &flags);
if (ULONG_CMP_LT(ssp->srcu_gp_seq_needed_exp, s))
WRITE_ONCE(ssp->srcu_gp_seq_needed_exp, s);
- spin_unlock_irqrestore_rcu_node(ssp, flags);
+ spin_unlock_irqrestore_rcu_node(ssp->srcu_sup, flags);
}
/*
@@ -1064,7 +1064,7 @@ static void srcu_funnel_gp_start(struct srcu_struct *ssp, struct srcu_data *sdp,
else if (list_empty(&ssp->work.work.entry))
list_add(&ssp->work.work.entry, &srcu_boot_list);
}
- spin_unlock_irqrestore_rcu_node(ssp, flags);
+ spin_unlock_irqrestore_rcu_node(ssp->srcu_sup, flags);
}
/*
@@ -1599,17 +1599,17 @@ static void srcu_advance_state(struct srcu_struct *ssp)
*/
idx = rcu_seq_state(smp_load_acquire(&ssp->srcu_gp_seq)); /* ^^^ */
if (idx == SRCU_STATE_IDLE) {
- spin_lock_irq_rcu_node(ssp);
+ spin_lock_irq_rcu_node(ssp->srcu_sup);
if (ULONG_CMP_GE(ssp->srcu_gp_seq, ssp->srcu_gp_seq_needed)) {
WARN_ON_ONCE(rcu_seq_state(ssp->srcu_gp_seq));
- spin_unlock_irq_rcu_node(ssp);
+ spin_unlock_irq_rcu_node(ssp->srcu_sup);
mutex_unlock(&ssp->srcu_gp_mutex);
return;
}
idx = rcu_seq_state(READ_ONCE(ssp->srcu_gp_seq));
if (idx == SRCU_STATE_IDLE)
srcu_gp_start(ssp);
- spin_unlock_irq_rcu_node(ssp);
+ spin_unlock_irq_rcu_node(ssp->srcu_sup);
if (idx != SRCU_STATE_IDLE) {
mutex_unlock(&ssp->srcu_gp_mutex);
return; /* Someone else started the grace period. */
@@ -1623,10 +1623,10 @@ static void srcu_advance_state(struct srcu_struct *ssp)
return; /* readers present, retry later. */
}
srcu_flip(ssp);
- spin_lock_irq_rcu_node(ssp);
+ spin_lock_irq_rcu_node(ssp->srcu_sup);
rcu_seq_set_state(&ssp->srcu_gp_seq, SRCU_STATE_SCAN2);
ssp->srcu_n_exp_nodelay = 0;
- spin_unlock_irq_rcu_node(ssp);
+ spin_unlock_irq_rcu_node(ssp->srcu_sup);
}
if (rcu_seq_state(READ_ONCE(ssp->srcu_gp_seq)) == SRCU_STATE_SCAN2) {
@@ -1710,7 +1710,7 @@ static void srcu_reschedule(struct srcu_struct *ssp, unsigned long delay)
{
bool pushgp = true;
- spin_lock_irq_rcu_node(ssp);
+ spin_lock_irq_rcu_node(ssp->srcu_sup);
if (ULONG_CMP_GE(ssp->srcu_gp_seq, ssp->srcu_gp_seq_needed)) {
if (!WARN_ON_ONCE(rcu_seq_state(ssp->srcu_gp_seq))) {
/* All requests fulfilled, time to go idle. */
@@ -1720,7 +1720,7 @@ static void srcu_reschedule(struct srcu_struct *ssp, unsigned long delay)
/* Outstanding request and no GP. Start one. */
srcu_gp_start(ssp);
}
- spin_unlock_irq_rcu_node(ssp);
+ spin_unlock_irq_rcu_node(ssp->srcu_sup);
if (pushgp)
queue_delayed_work(rcu_gp_wq, &ssp->work, delay);
--
2.40.0.rc2
If a given statically allocated in-module srcu_struct structure was ever
used for updates, srcu_module_going() will invoke cleanup_srcu_struct()
at module-exit time. This will check for the error case of SRCU readers
persisting past module-exit time. On the other hand, if this srcu_struct
structure never went through a grace period, srcu_module_going() only
invokes free_percpu(), which would result in strange failures if SRCU
readers persisted past module-exit time.
This commit therefore adds a srcu_readers_active() check to
srcu_module_going(), splatting if readers have persisted and refraining
from invoking free_percpu() in that case. Better to leak memory than
to suffer silent memory corruption!
[ paulmck: Apply Zhang, Qiang1 feedback on memory leak. ]
Signed-off-by: Paul E. McKenney <[email protected]>
---
kernel/rcu/srcutree.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 169a6513b739..f9dd6ed5503e 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -1911,7 +1911,8 @@ static void srcu_module_going(struct module *mod)
if (!rcu_seq_state(smp_load_acquire(&ssp->srcu_sup->srcu_gp_seq_needed)) &&
!WARN_ON_ONCE(!ssp->srcu_sup->sda_is_static))
cleanup_srcu_struct(ssp);
- free_percpu(ssp->sda);
+ if (!WARN_ON(srcu_readers_active(ssp)))
+ free_percpu(ssp->sda);
}
}
--
2.40.0.rc2
This commit moves the ->sda_is_static field from the srcu_struct structure
to the srcu_usage structure to reduce the size of the former in order
to improve cache locality.
Suggested-by: Christoph Hellwig <[email protected]>
Tested-by: Sachin Sant <[email protected]>
Tested-by: "Zhang, Qiang1" <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
---
include/linux/srcutree.h | 2 +-
kernel/rcu/srcutree.c | 8 ++++----
2 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
index 3023492d8d89..d3534ecb806e 100644
--- a/include/linux/srcutree.h
+++ b/include/linux/srcutree.h
@@ -76,6 +76,7 @@ struct srcu_usage {
unsigned long srcu_size_jiffies; /* Current contention-measurement interval. */
unsigned long srcu_n_lock_retries; /* Contention events in current interval. */
unsigned long srcu_n_exp_nodelay; /* # expedited no-delays in current GP phase. */
+ bool sda_is_static; /* May ->sda be passed to free_percpu()? */
};
/*
@@ -84,7 +85,6 @@ struct srcu_usage {
struct srcu_struct {
unsigned int srcu_idx; /* Current rdr array element. */
struct srcu_data __percpu *sda; /* Per-CPU srcu_data array. */
- bool sda_is_static; /* May ->sda be passed to free_percpu()? */
unsigned long srcu_barrier_seq; /* srcu_barrier seq #. */
struct mutex srcu_barrier_mutex; /* Serialize barrier ops. */
struct completion srcu_barrier_completion;
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 291fb520bce0..20f2373f7e25 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -252,7 +252,7 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
mutex_init(&ssp->srcu_barrier_mutex);
atomic_set(&ssp->srcu_barrier_cpu_cnt, 0);
INIT_DELAYED_WORK(&ssp->work, process_srcu);
- ssp->sda_is_static = is_static;
+ ssp->srcu_sup->sda_is_static = is_static;
if (!is_static)
ssp->sda = alloc_percpu(struct srcu_data);
if (!ssp->sda) {
@@ -265,7 +265,7 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
ssp->srcu_sup->srcu_last_gp_end = ktime_get_mono_fast_ns();
if (READ_ONCE(ssp->srcu_sup->srcu_size_state) == SRCU_SIZE_SMALL && SRCU_SIZING_IS_INIT()) {
if (!init_srcu_struct_nodes(ssp, GFP_ATOMIC)) {
- if (!ssp->sda_is_static) {
+ if (!ssp->srcu_sup->sda_is_static) {
free_percpu(ssp->sda);
ssp->sda = NULL;
kfree(ssp->srcu_sup);
@@ -667,7 +667,7 @@ void cleanup_srcu_struct(struct srcu_struct *ssp)
kfree(ssp->srcu_sup->node);
ssp->srcu_sup->node = NULL;
ssp->srcu_sup->srcu_size_state = SRCU_SIZE_SMALL;
- if (!ssp->sda_is_static) {
+ if (!ssp->srcu_sup->sda_is_static) {
free_percpu(ssp->sda);
ssp->sda = NULL;
kfree(ssp->srcu_sup);
@@ -1906,7 +1906,7 @@ static void srcu_module_going(struct module *mod)
for (i = 0; i < mod->num_srcu_structs; i++) {
ssp = *(sspp++);
if (!rcu_seq_state(smp_load_acquire(&ssp->srcu_sup->srcu_gp_seq_needed)) &&
- !WARN_ON_ONCE(!ssp->sda_is_static))
+ !WARN_ON_ONCE(!ssp->srcu_sup->sda_is_static))
cleanup_srcu_struct(ssp);
free_percpu(ssp->sda);
}
--
2.40.0.rc2
This commit creates an srcu_usage pointer named "sup" as a shorter
synonym for the "ssp->srcu_sup" that was bloating several lines of code.
Cc: Christoph Hellwig <[email protected]>
Tested-by: Sachin Sant <[email protected]>
Tested-by: "Zhang, Qiang1" <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
---
kernel/rcu/srcutree.c | 41 +++++++++++++++++++++--------------------
1 file changed, 21 insertions(+), 20 deletions(-)
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 11a08201ca0a..f661a0f6bc0d 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -862,28 +862,29 @@ static void srcu_gp_end(struct srcu_struct *ssp)
unsigned long sgsne;
struct srcu_node *snp;
int ss_state;
+ struct srcu_usage *sup = ssp->srcu_sup;
/* Prevent more than one additional grace period. */
- mutex_lock(&ssp->srcu_sup->srcu_cb_mutex);
+ mutex_lock(&sup->srcu_cb_mutex);
/* End the current grace period. */
- spin_lock_irq_rcu_node(ssp->srcu_sup);
- idx = rcu_seq_state(ssp->srcu_sup->srcu_gp_seq);
+ spin_lock_irq_rcu_node(sup);
+ idx = rcu_seq_state(sup->srcu_gp_seq);
WARN_ON_ONCE(idx != SRCU_STATE_SCAN2);
- if (ULONG_CMP_LT(READ_ONCE(ssp->srcu_sup->srcu_gp_seq), READ_ONCE(ssp->srcu_sup->srcu_gp_seq_needed_exp)))
+ if (ULONG_CMP_LT(READ_ONCE(sup->srcu_gp_seq), READ_ONCE(sup->srcu_gp_seq_needed_exp)))
cbdelay = 0;
- WRITE_ONCE(ssp->srcu_sup->srcu_last_gp_end, ktime_get_mono_fast_ns());
- rcu_seq_end(&ssp->srcu_sup->srcu_gp_seq);
- gpseq = rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq);
- if (ULONG_CMP_LT(ssp->srcu_sup->srcu_gp_seq_needed_exp, gpseq))
- WRITE_ONCE(ssp->srcu_sup->srcu_gp_seq_needed_exp, gpseq);
- spin_unlock_irq_rcu_node(ssp->srcu_sup);
- mutex_unlock(&ssp->srcu_sup->srcu_gp_mutex);
+ WRITE_ONCE(sup->srcu_last_gp_end, ktime_get_mono_fast_ns());
+ rcu_seq_end(&sup->srcu_gp_seq);
+ gpseq = rcu_seq_current(&sup->srcu_gp_seq);
+ if (ULONG_CMP_LT(sup->srcu_gp_seq_needed_exp, gpseq))
+ WRITE_ONCE(sup->srcu_gp_seq_needed_exp, gpseq);
+ spin_unlock_irq_rcu_node(sup);
+ mutex_unlock(&sup->srcu_gp_mutex);
/* A new grace period can start at this point. But only one. */
/* Initiate callback invocation as needed. */
- ss_state = smp_load_acquire(&ssp->srcu_sup->srcu_size_state);
+ ss_state = smp_load_acquire(&sup->srcu_size_state);
if (ss_state < SRCU_SIZE_WAIT_BARRIER) {
srcu_schedule_cbs_sdp(per_cpu_ptr(ssp->sda, get_boot_cpu_id()),
cbdelay);
@@ -892,7 +893,7 @@ static void srcu_gp_end(struct srcu_struct *ssp)
srcu_for_each_node_breadth_first(ssp, snp) {
spin_lock_irq_rcu_node(snp);
cbs = false;
- last_lvl = snp >= ssp->srcu_sup->level[rcu_num_lvls - 1];
+ last_lvl = snp >= sup->level[rcu_num_lvls - 1];
if (last_lvl)
cbs = ss_state < SRCU_SIZE_BIG || snp->srcu_have_cbs[idx] == gpseq;
snp->srcu_have_cbs[idx] = gpseq;
@@ -924,18 +925,18 @@ static void srcu_gp_end(struct srcu_struct *ssp)
}
/* Callback initiation done, allow grace periods after next. */
- mutex_unlock(&ssp->srcu_sup->srcu_cb_mutex);
+ mutex_unlock(&sup->srcu_cb_mutex);
/* Start a new grace period if needed. */
- spin_lock_irq_rcu_node(ssp->srcu_sup);
- gpseq = rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq);
+ spin_lock_irq_rcu_node(sup);
+ gpseq = rcu_seq_current(&sup->srcu_gp_seq);
if (!rcu_seq_state(gpseq) &&
- ULONG_CMP_LT(gpseq, ssp->srcu_sup->srcu_gp_seq_needed)) {
+ ULONG_CMP_LT(gpseq, sup->srcu_gp_seq_needed)) {
srcu_gp_start(ssp);
- spin_unlock_irq_rcu_node(ssp->srcu_sup);
+ spin_unlock_irq_rcu_node(sup);
srcu_reschedule(ssp, 0);
} else {
- spin_unlock_irq_rcu_node(ssp->srcu_sup);
+ spin_unlock_irq_rcu_node(sup);
}
/* Transition to big if needed. */
@@ -943,7 +944,7 @@ static void srcu_gp_end(struct srcu_struct *ssp)
if (ss_state == SRCU_SIZE_ALLOC)
init_srcu_struct_nodes(ssp, GFP_KERNEL);
else
- smp_store_release(&ssp->srcu_sup->srcu_size_state, ss_state + 1);
+ smp_store_release(&sup->srcu_size_state, ss_state + 1);
}
}
--
2.40.0.rc2
This commit moves the ->srcu_barrier_seq, ->srcu_barrier_mutex,
->srcu_barrier_completion, and ->srcu_barrier_cpu_cnt fields from the
srcu_struct structure to the srcu_usage structure to reduce the size of
the former in order to improve cache locality.
Suggested-by: Christoph Hellwig <[email protected]>
Tested-by: Sachin Sant <[email protected]>
Tested-by: "Zhang, Qiang1" <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
---
include/linux/srcutree.h | 14 +++++++-------
kernel/rcu/srcutree.c | 38 +++++++++++++++++++-------------------
2 files changed, 26 insertions(+), 26 deletions(-)
diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
index d3534ecb806e..d544ec1c0c8e 100644
--- a/include/linux/srcutree.h
+++ b/include/linux/srcutree.h
@@ -77,6 +77,13 @@ struct srcu_usage {
unsigned long srcu_n_lock_retries; /* Contention events in current interval. */
unsigned long srcu_n_exp_nodelay; /* # expedited no-delays in current GP phase. */
bool sda_is_static; /* May ->sda be passed to free_percpu()? */
+ unsigned long srcu_barrier_seq; /* srcu_barrier seq #. */
+ struct mutex srcu_barrier_mutex; /* Serialize barrier ops. */
+ struct completion srcu_barrier_completion;
+ /* Awaken barrier rq at end. */
+ atomic_t srcu_barrier_cpu_cnt; /* # CPUs not yet posting a */
+ /* callback for the barrier */
+ /* operation. */
};
/*
@@ -85,13 +92,6 @@ struct srcu_usage {
struct srcu_struct {
unsigned int srcu_idx; /* Current rdr array element. */
struct srcu_data __percpu *sda; /* Per-CPU srcu_data array. */
- unsigned long srcu_barrier_seq; /* srcu_barrier seq #. */
- struct mutex srcu_barrier_mutex; /* Serialize barrier ops. */
- struct completion srcu_barrier_completion;
- /* Awaken barrier rq at end. */
- atomic_t srcu_barrier_cpu_cnt; /* # CPUs not yet posting a */
- /* callback for the barrier */
- /* operation. */
unsigned long reschedule_jiffies;
unsigned long reschedule_count;
struct delayed_work work;
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 20f2373f7e25..97d1fe9a160c 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -248,9 +248,9 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
mutex_init(&ssp->srcu_sup->srcu_gp_mutex);
ssp->srcu_idx = 0;
ssp->srcu_sup->srcu_gp_seq = 0;
- ssp->srcu_barrier_seq = 0;
- mutex_init(&ssp->srcu_barrier_mutex);
- atomic_set(&ssp->srcu_barrier_cpu_cnt, 0);
+ ssp->srcu_sup->srcu_barrier_seq = 0;
+ mutex_init(&ssp->srcu_sup->srcu_barrier_mutex);
+ atomic_set(&ssp->srcu_sup->srcu_barrier_cpu_cnt, 0);
INIT_DELAYED_WORK(&ssp->work, process_srcu);
ssp->srcu_sup->sda_is_static = is_static;
if (!is_static)
@@ -1496,8 +1496,8 @@ static void srcu_barrier_cb(struct rcu_head *rhp)
sdp = container_of(rhp, struct srcu_data, srcu_barrier_head);
ssp = sdp->ssp;
- if (atomic_dec_and_test(&ssp->srcu_barrier_cpu_cnt))
- complete(&ssp->srcu_barrier_completion);
+ if (atomic_dec_and_test(&ssp->srcu_sup->srcu_barrier_cpu_cnt))
+ complete(&ssp->srcu_sup->srcu_barrier_completion);
}
/*
@@ -1511,13 +1511,13 @@ static void srcu_barrier_cb(struct rcu_head *rhp)
static void srcu_barrier_one_cpu(struct srcu_struct *ssp, struct srcu_data *sdp)
{
spin_lock_irq_rcu_node(sdp);
- atomic_inc(&ssp->srcu_barrier_cpu_cnt);
+ atomic_inc(&ssp->srcu_sup->srcu_barrier_cpu_cnt);
sdp->srcu_barrier_head.func = srcu_barrier_cb;
debug_rcu_head_queue(&sdp->srcu_barrier_head);
if (!rcu_segcblist_entrain(&sdp->srcu_cblist,
&sdp->srcu_barrier_head)) {
debug_rcu_head_unqueue(&sdp->srcu_barrier_head);
- atomic_dec(&ssp->srcu_barrier_cpu_cnt);
+ atomic_dec(&ssp->srcu_sup->srcu_barrier_cpu_cnt);
}
spin_unlock_irq_rcu_node(sdp);
}
@@ -1530,20 +1530,20 @@ void srcu_barrier(struct srcu_struct *ssp)
{
int cpu;
int idx;
- unsigned long s = rcu_seq_snap(&ssp->srcu_barrier_seq);
+ unsigned long s = rcu_seq_snap(&ssp->srcu_sup->srcu_barrier_seq);
check_init_srcu_struct(ssp);
- mutex_lock(&ssp->srcu_barrier_mutex);
- if (rcu_seq_done(&ssp->srcu_barrier_seq, s)) {
+ mutex_lock(&ssp->srcu_sup->srcu_barrier_mutex);
+ if (rcu_seq_done(&ssp->srcu_sup->srcu_barrier_seq, s)) {
smp_mb(); /* Force ordering following return. */
- mutex_unlock(&ssp->srcu_barrier_mutex);
+ mutex_unlock(&ssp->srcu_sup->srcu_barrier_mutex);
return; /* Someone else did our work for us. */
}
- rcu_seq_start(&ssp->srcu_barrier_seq);
- init_completion(&ssp->srcu_barrier_completion);
+ rcu_seq_start(&ssp->srcu_sup->srcu_barrier_seq);
+ init_completion(&ssp->srcu_sup->srcu_barrier_completion);
/* Initial count prevents reaching zero until all CBs are posted. */
- atomic_set(&ssp->srcu_barrier_cpu_cnt, 1);
+ atomic_set(&ssp->srcu_sup->srcu_barrier_cpu_cnt, 1);
idx = __srcu_read_lock_nmisafe(ssp);
if (smp_load_acquire(&ssp->srcu_sup->srcu_size_state) < SRCU_SIZE_WAIT_BARRIER)
@@ -1554,12 +1554,12 @@ void srcu_barrier(struct srcu_struct *ssp)
__srcu_read_unlock_nmisafe(ssp, idx);
/* Remove the initial count, at which point reaching zero can happen. */
- if (atomic_dec_and_test(&ssp->srcu_barrier_cpu_cnt))
- complete(&ssp->srcu_barrier_completion);
- wait_for_completion(&ssp->srcu_barrier_completion);
+ if (atomic_dec_and_test(&ssp->srcu_sup->srcu_barrier_cpu_cnt))
+ complete(&ssp->srcu_sup->srcu_barrier_completion);
+ wait_for_completion(&ssp->srcu_sup->srcu_barrier_completion);
- rcu_seq_end(&ssp->srcu_barrier_seq);
- mutex_unlock(&ssp->srcu_barrier_mutex);
+ rcu_seq_end(&ssp->srcu_sup->srcu_barrier_seq);
+ mutex_unlock(&ssp->srcu_sup->srcu_barrier_mutex);
}
EXPORT_SYMBOL_GPL(srcu_barrier);
--
2.40.0.rc2
This commit moves the ->reschedule_jiffies, ->reschedule_count, and
->work fields from the srcu_struct structure to the srcu_usage structure
to reduce the size of the former in order to improve cache locality.
However, this means that the container_of() calls cannot get a pointer
to the srcu_struct because they are no longer in the srcu_struct.
This issue is addressed by adding a ->srcu_ssp field in the srcu_usage
structure that references the corresponding srcu_struct structure.
And given the presence of the sup pointer to the srcu_usage structure,
replace some ssp->srcu_usage-> instances with sup->.
[ paulmck Apply feedback from kernel test robot. ]
Link: https://lore.kernel.org/oe-kbuild-all/[email protected]/
Suggested-by: Christoph Hellwig <[email protected]>
Tested-by: Sachin Sant <[email protected]>
Tested-by: "Zhang, Qiang1" <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
---
include/linux/srcutree.h | 9 +++++----
kernel/rcu/srcutree.c | 41 +++++++++++++++++++++-------------------
2 files changed, 27 insertions(+), 23 deletions(-)
diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
index d544ec1c0c8e..cd0cdd8142c5 100644
--- a/include/linux/srcutree.h
+++ b/include/linux/srcutree.h
@@ -84,6 +84,10 @@ struct srcu_usage {
atomic_t srcu_barrier_cpu_cnt; /* # CPUs not yet posting a */
/* callback for the barrier */
/* operation. */
+ unsigned long reschedule_jiffies;
+ unsigned long reschedule_count;
+ struct delayed_work work;
+ struct srcu_struct *srcu_ssp;
};
/*
@@ -92,9 +96,6 @@ struct srcu_usage {
struct srcu_struct {
unsigned int srcu_idx; /* Current rdr array element. */
struct srcu_data __percpu *sda; /* Per-CPU srcu_data array. */
- unsigned long reschedule_jiffies;
- unsigned long reschedule_count;
- struct delayed_work work;
struct lockdep_map dep_map;
struct srcu_usage *srcu_sup; /* Update-side data. */
};
@@ -119,10 +120,10 @@ struct srcu_struct {
{ \
.lock = __SPIN_LOCK_UNLOCKED(name.lock), \
.srcu_gp_seq_needed = -1UL, \
+ .work = __DELAYED_WORK_INITIALIZER(name.work, NULL, 0), \
}
#define __SRCU_STRUCT_INIT_COMMON(name, usage_name) \
- .work = __DELAYED_WORK_INITIALIZER(name.work, NULL, 0), \
.srcu_sup = &usage_name, \
__SRCU_DEP_MAP_INIT(name)
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 97d1fe9a160c..169a6513b739 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -251,7 +251,7 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
ssp->srcu_sup->srcu_barrier_seq = 0;
mutex_init(&ssp->srcu_sup->srcu_barrier_mutex);
atomic_set(&ssp->srcu_sup->srcu_barrier_cpu_cnt, 0);
- INIT_DELAYED_WORK(&ssp->work, process_srcu);
+ INIT_DELAYED_WORK(&ssp->srcu_sup->work, process_srcu);
ssp->srcu_sup->sda_is_static = is_static;
if (!is_static)
ssp->sda = alloc_percpu(struct srcu_data);
@@ -275,6 +275,7 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
WRITE_ONCE(ssp->srcu_sup->srcu_size_state, SRCU_SIZE_BIG);
}
}
+ ssp->srcu_sup->srcu_ssp = ssp;
smp_store_release(&ssp->srcu_sup->srcu_gp_seq_needed, 0); /* Init done. */
return 0;
}
@@ -647,7 +648,7 @@ void cleanup_srcu_struct(struct srcu_struct *ssp)
return; /* Just leak it! */
if (WARN_ON(srcu_readers_active(ssp)))
return; /* Just leak it! */
- flush_delayed_work(&ssp->work);
+ flush_delayed_work(&ssp->srcu_sup->work);
for_each_possible_cpu(cpu) {
struct srcu_data *sdp = per_cpu_ptr(ssp->sda, cpu);
@@ -1059,10 +1060,10 @@ static void srcu_funnel_gp_start(struct srcu_struct *ssp, struct srcu_data *sdp,
// can only be executed during early boot when there is only
// the one boot CPU running with interrupts still disabled.
if (likely(srcu_init_done))
- queue_delayed_work(rcu_gp_wq, &ssp->work,
+ queue_delayed_work(rcu_gp_wq, &ssp->srcu_sup->work,
!!srcu_get_delay(ssp));
- else if (list_empty(&ssp->work.work.entry))
- list_add(&ssp->work.work.entry, &srcu_boot_list);
+ else if (list_empty(&ssp->srcu_sup->work.work.entry))
+ list_add(&ssp->srcu_sup->work.work.entry, &srcu_boot_list);
}
spin_unlock_irqrestore_rcu_node(ssp->srcu_sup, flags);
}
@@ -1723,7 +1724,7 @@ static void srcu_reschedule(struct srcu_struct *ssp, unsigned long delay)
spin_unlock_irq_rcu_node(ssp->srcu_sup);
if (pushgp)
- queue_delayed_work(rcu_gp_wq, &ssp->work, delay);
+ queue_delayed_work(rcu_gp_wq, &ssp->srcu_sup->work, delay);
}
/*
@@ -1734,22 +1735,24 @@ static void process_srcu(struct work_struct *work)
unsigned long curdelay;
unsigned long j;
struct srcu_struct *ssp;
+ struct srcu_usage *sup;
- ssp = container_of(work, struct srcu_struct, work.work);
+ sup = container_of(work, struct srcu_usage, work.work);
+ ssp = sup->srcu_ssp;
srcu_advance_state(ssp);
curdelay = srcu_get_delay(ssp);
if (curdelay) {
- WRITE_ONCE(ssp->reschedule_count, 0);
+ WRITE_ONCE(sup->reschedule_count, 0);
} else {
j = jiffies;
- if (READ_ONCE(ssp->reschedule_jiffies) == j) {
- WRITE_ONCE(ssp->reschedule_count, READ_ONCE(ssp->reschedule_count) + 1);
- if (READ_ONCE(ssp->reschedule_count) > srcu_max_nodelay)
+ if (READ_ONCE(sup->reschedule_jiffies) == j) {
+ WRITE_ONCE(sup->reschedule_count, READ_ONCE(sup->reschedule_count) + 1);
+ if (READ_ONCE(sup->reschedule_count) > srcu_max_nodelay)
curdelay = 1;
} else {
- WRITE_ONCE(ssp->reschedule_count, 1);
- WRITE_ONCE(ssp->reschedule_jiffies, j);
+ WRITE_ONCE(sup->reschedule_count, 1);
+ WRITE_ONCE(sup->reschedule_jiffies, j);
}
}
srcu_reschedule(ssp, curdelay);
@@ -1848,7 +1851,7 @@ early_initcall(srcu_bootup_announce);
void __init srcu_init(void)
{
- struct srcu_struct *ssp;
+ struct srcu_usage *sup;
/* Decide on srcu_struct-size strategy. */
if (SRCU_SIZING_IS(SRCU_SIZING_AUTO)) {
@@ -1868,13 +1871,13 @@ void __init srcu_init(void)
*/
srcu_init_done = true;
while (!list_empty(&srcu_boot_list)) {
- ssp = list_first_entry(&srcu_boot_list, struct srcu_struct,
+ sup = list_first_entry(&srcu_boot_list, struct srcu_usage,
work.work.entry);
- list_del_init(&ssp->work.work.entry);
+ list_del_init(&sup->work.work.entry);
if (SRCU_SIZING_IS(SRCU_SIZING_INIT) &&
- ssp->srcu_sup->srcu_size_state == SRCU_SIZE_SMALL)
- ssp->srcu_sup->srcu_size_state = SRCU_SIZE_ALLOC;
- queue_work(rcu_gp_wq, &ssp->work.work);
+ sup->srcu_size_state == SRCU_SIZE_SMALL)
+ sup->srcu_size_state = SRCU_SIZE_ALLOC;
+ queue_work(rcu_gp_wq, &sup->work.work);
}
}
--
2.40.0.rc2
This commit creates an srcu_usage pointer named "sup" as a shorter
synonym for the "ssp->srcu_sup" that was bloating several lines of code.
Cc: Christoph Hellwig <[email protected]>
Tested-by: Sachin Sant <[email protected]>
Tested-by: "Zhang, Qiang1" <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
---
kernel/rcu/srcutree.c | 21 +++++++++++----------
1 file changed, 11 insertions(+), 10 deletions(-)
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 9699bcab7215..11a08201ca0a 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -644,12 +644,13 @@ static unsigned long srcu_get_delay(struct srcu_struct *ssp)
void cleanup_srcu_struct(struct srcu_struct *ssp)
{
int cpu;
+ struct srcu_usage *sup = ssp->srcu_sup;
if (WARN_ON(!srcu_get_delay(ssp)))
return; /* Just leak it! */
if (WARN_ON(srcu_readers_active(ssp)))
return; /* Just leak it! */
- flush_delayed_work(&ssp->srcu_sup->work);
+ flush_delayed_work(&sup->work);
for_each_possible_cpu(cpu) {
struct srcu_data *sdp = per_cpu_ptr(ssp->sda, cpu);
@@ -658,21 +659,21 @@ void cleanup_srcu_struct(struct srcu_struct *ssp)
if (WARN_ON(rcu_segcblist_n_cbs(&sdp->srcu_cblist)))
return; /* Forgot srcu_barrier(), so just leak it! */
}
- if (WARN_ON(rcu_seq_state(READ_ONCE(ssp->srcu_sup->srcu_gp_seq)) != SRCU_STATE_IDLE) ||
- WARN_ON(rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq) != ssp->srcu_sup->srcu_gp_seq_needed) ||
+ if (WARN_ON(rcu_seq_state(READ_ONCE(sup->srcu_gp_seq)) != SRCU_STATE_IDLE) ||
+ WARN_ON(rcu_seq_current(&sup->srcu_gp_seq) != sup->srcu_gp_seq_needed) ||
WARN_ON(srcu_readers_active(ssp))) {
pr_info("%s: Active srcu_struct %p read state: %d gp state: %lu/%lu\n",
- __func__, ssp, rcu_seq_state(READ_ONCE(ssp->srcu_sup->srcu_gp_seq)),
- rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq), ssp->srcu_sup->srcu_gp_seq_needed);
+ __func__, ssp, rcu_seq_state(READ_ONCE(sup->srcu_gp_seq)),
+ rcu_seq_current(&sup->srcu_gp_seq), sup->srcu_gp_seq_needed);
return; /* Caller forgot to stop doing call_srcu()? */
}
- kfree(ssp->srcu_sup->node);
- ssp->srcu_sup->node = NULL;
- ssp->srcu_sup->srcu_size_state = SRCU_SIZE_SMALL;
- if (!ssp->srcu_sup->sda_is_static) {
+ kfree(sup->node);
+ sup->node = NULL;
+ sup->srcu_size_state = SRCU_SIZE_SMALL;
+ if (!sup->sda_is_static) {
free_percpu(ssp->sda);
ssp->sda = NULL;
- kfree(ssp->srcu_sup);
+ kfree(sup);
ssp->srcu_sup = NULL;
}
}
--
2.40.0.rc2
This commit creates an srcu_usage pointer named "sup" as a shorter
synonym for the "ssp->srcu_sup" that was bloating several lines of code.
Signed-off-by: Paul E. McKenney <[email protected]>
Tested-by: Sachin Sant <[email protected]>
Tested-by: "Zhang, Qiang1" <[email protected]>
Cc: Christoph Hellwig <[email protected]>
---
kernel/rcu/srcutree.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index f9dd6ed5503e..9699bcab7215 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -616,17 +616,18 @@ static unsigned long srcu_get_delay(struct srcu_struct *ssp)
unsigned long gpstart;
unsigned long j;
unsigned long jbase = SRCU_INTERVAL;
+ struct srcu_usage *sup = ssp->srcu_sup;
- if (ULONG_CMP_LT(READ_ONCE(ssp->srcu_sup->srcu_gp_seq), READ_ONCE(ssp->srcu_sup->srcu_gp_seq_needed_exp)))
+ if (ULONG_CMP_LT(READ_ONCE(sup->srcu_gp_seq), READ_ONCE(sup->srcu_gp_seq_needed_exp)))
jbase = 0;
- if (rcu_seq_state(READ_ONCE(ssp->srcu_sup->srcu_gp_seq))) {
+ if (rcu_seq_state(READ_ONCE(sup->srcu_gp_seq))) {
j = jiffies - 1;
- gpstart = READ_ONCE(ssp->srcu_sup->srcu_gp_start);
+ gpstart = READ_ONCE(sup->srcu_gp_start);
if (time_after(j, gpstart))
jbase += j - gpstart;
if (!jbase) {
- WRITE_ONCE(ssp->srcu_sup->srcu_n_exp_nodelay, READ_ONCE(ssp->srcu_sup->srcu_n_exp_nodelay) + 1);
- if (READ_ONCE(ssp->srcu_sup->srcu_n_exp_nodelay) > srcu_max_nodelay_phase)
+ WRITE_ONCE(sup->srcu_n_exp_nodelay, READ_ONCE(sup->srcu_n_exp_nodelay) + 1);
+ if (READ_ONCE(sup->srcu_n_exp_nodelay) > srcu_max_nodelay_phase)
jbase = 1;
}
}
--
2.40.0.rc2
This commit creates an srcu_usage pointer named "sup" as a shorter
synonym for the "ssp->srcu_sup" that was bloating several lines of code.
Cc: Christoph Hellwig <[email protected]>
Tested-by: Sachin Sant <[email protected]>
Tested-by: "Zhang, Qiang1" <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
---
kernel/rcu/srcutree.c | 27 ++++++++++++++-------------
1 file changed, 14 insertions(+), 13 deletions(-)
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index f661a0f6bc0d..a887cfc89894 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -1004,9 +1004,10 @@ static void srcu_funnel_gp_start(struct srcu_struct *ssp, struct srcu_data *sdp,
struct srcu_node *snp;
struct srcu_node *snp_leaf;
unsigned long snp_seq;
+ struct srcu_usage *sup = ssp->srcu_sup;
/* Ensure that snp node tree is fully initialized before traversing it */
- if (smp_load_acquire(&ssp->srcu_sup->srcu_size_state) < SRCU_SIZE_WAIT_BARRIER)
+ if (smp_load_acquire(&sup->srcu_size_state) < SRCU_SIZE_WAIT_BARRIER)
snp_leaf = NULL;
else
snp_leaf = sdp->mynode;
@@ -1014,7 +1015,7 @@ static void srcu_funnel_gp_start(struct srcu_struct *ssp, struct srcu_data *sdp,
if (snp_leaf)
/* Each pass through the loop does one level of the srcu_node tree. */
for (snp = snp_leaf; snp != NULL; snp = snp->srcu_parent) {
- if (WARN_ON_ONCE(rcu_seq_done(&ssp->srcu_sup->srcu_gp_seq, s)) && snp != snp_leaf)
+ if (WARN_ON_ONCE(rcu_seq_done(&sup->srcu_gp_seq, s)) && snp != snp_leaf)
return; /* GP already done and CBs recorded. */
spin_lock_irqsave_rcu_node(snp, flags);
snp_seq = snp->srcu_have_cbs[idx];
@@ -1041,20 +1042,20 @@ static void srcu_funnel_gp_start(struct srcu_struct *ssp, struct srcu_data *sdp,
/* Top of tree, must ensure the grace period will be started. */
spin_lock_irqsave_ssp_contention(ssp, &flags);
- if (ULONG_CMP_LT(ssp->srcu_sup->srcu_gp_seq_needed, s)) {
+ if (ULONG_CMP_LT(sup->srcu_gp_seq_needed, s)) {
/*
* Record need for grace period s. Pair with load
* acquire setting up for initialization.
*/
- smp_store_release(&ssp->srcu_sup->srcu_gp_seq_needed, s); /*^^^*/
+ smp_store_release(&sup->srcu_gp_seq_needed, s); /*^^^*/
}
- if (!do_norm && ULONG_CMP_LT(ssp->srcu_sup->srcu_gp_seq_needed_exp, s))
- WRITE_ONCE(ssp->srcu_sup->srcu_gp_seq_needed_exp, s);
+ if (!do_norm && ULONG_CMP_LT(sup->srcu_gp_seq_needed_exp, s))
+ WRITE_ONCE(sup->srcu_gp_seq_needed_exp, s);
/* If grace period not already in progress, start it. */
- if (!WARN_ON_ONCE(rcu_seq_done(&ssp->srcu_sup->srcu_gp_seq, s)) &&
- rcu_seq_state(ssp->srcu_sup->srcu_gp_seq) == SRCU_STATE_IDLE) {
- WARN_ON_ONCE(ULONG_CMP_GE(ssp->srcu_sup->srcu_gp_seq, ssp->srcu_sup->srcu_gp_seq_needed));
+ if (!WARN_ON_ONCE(rcu_seq_done(&sup->srcu_gp_seq, s)) &&
+ rcu_seq_state(sup->srcu_gp_seq) == SRCU_STATE_IDLE) {
+ WARN_ON_ONCE(ULONG_CMP_GE(sup->srcu_gp_seq, sup->srcu_gp_seq_needed));
srcu_gp_start(ssp);
// And how can that list_add() in the "else" clause
@@ -1063,12 +1064,12 @@ static void srcu_funnel_gp_start(struct srcu_struct *ssp, struct srcu_data *sdp,
// can only be executed during early boot when there is only
// the one boot CPU running with interrupts still disabled.
if (likely(srcu_init_done))
- queue_delayed_work(rcu_gp_wq, &ssp->srcu_sup->work,
+ queue_delayed_work(rcu_gp_wq, &sup->work,
!!srcu_get_delay(ssp));
- else if (list_empty(&ssp->srcu_sup->work.work.entry))
- list_add(&ssp->srcu_sup->work.work.entry, &srcu_boot_list);
+ else if (list_empty(&sup->work.work.entry))
+ list_add(&sup->work.work.entry, &srcu_boot_list);
}
- spin_unlock_irqrestore_rcu_node(ssp->srcu_sup, flags);
+ spin_unlock_irqrestore_rcu_node(sup, flags);
}
/*
--
2.40.0.rc2
This commit moves the ->srcu_gp_seq, ->srcu_gp_seq_needed,
->srcu_gp_seq_needed_exp, ->srcu_gp_start, and ->srcu_last_gp_end fields
from the srcu_struct structure to the srcu_usage structure to reduce
the size of the former in order to improve cache locality.
Suggested-by: Christoph Hellwig <[email protected]>
Tested-by: Sachin Sant <[email protected]>
Tested-by: "Zhang, Qiang1" <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
---
include/linux/srcutree.h | 25 ++++----
kernel/rcu/srcutree.c | 128 +++++++++++++++++++--------------------
2 files changed, 77 insertions(+), 76 deletions(-)
diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
index d04e3da6181c..372e35b0e8b6 100644
--- a/include/linux/srcutree.h
+++ b/include/linux/srcutree.h
@@ -68,6 +68,11 @@ struct srcu_usage {
struct mutex srcu_cb_mutex; /* Serialize CB preparation. */
spinlock_t __private lock; /* Protect counters and size state. */
struct mutex srcu_gp_mutex; /* Serialize GP work. */
+ unsigned long srcu_gp_seq; /* Grace-period seq #. */
+ unsigned long srcu_gp_seq_needed; /* Latest gp_seq needed. */
+ unsigned long srcu_gp_seq_needed_exp; /* Furthest future exp GP. */
+ unsigned long srcu_gp_start; /* Last GP start timestamp (jiffies) */
+ unsigned long srcu_last_gp_end; /* Last GP end timestamp (ns) */
};
/*
@@ -75,11 +80,6 @@ struct srcu_usage {
*/
struct srcu_struct {
unsigned int srcu_idx; /* Current rdr array element. */
- unsigned long srcu_gp_seq; /* Grace-period seq #. */
- unsigned long srcu_gp_seq_needed; /* Latest gp_seq needed. */
- unsigned long srcu_gp_seq_needed_exp; /* Furthest future exp GP. */
- unsigned long srcu_gp_start; /* Last GP start timestamp (jiffies) */
- unsigned long srcu_last_gp_end; /* Last GP end timestamp (ns) */
unsigned long srcu_size_jiffies; /* Current contention-measurement interval. */
unsigned long srcu_n_lock_retries; /* Contention events in current interval. */
unsigned long srcu_n_exp_nodelay; /* # expedited no-delays in current GP phase. */
@@ -115,8 +115,13 @@ struct srcu_struct {
#define SRCU_STATE_SCAN1 1
#define SRCU_STATE_SCAN2 2
-#define __SRCU_STRUCT_INIT_COMMON(name, usage_name) \
+#define __SRCU_USAGE_INIT(name) \
+{ \
+ .lock = __SPIN_LOCK_UNLOCKED(name.lock), \
.srcu_gp_seq_needed = -1UL, \
+}
+
+#define __SRCU_STRUCT_INIT_COMMON(name, usage_name) \
.work = __DELAYED_WORK_INITIALIZER(name.work, NULL, 0), \
.srcu_sup = &usage_name, \
__SRCU_DEP_MAP_INIT(name)
@@ -153,9 +158,7 @@ struct srcu_struct {
*/
#ifdef MODULE
# define __DEFINE_SRCU(name, is_static) \
- static struct srcu_usage name##_srcu_usage = { \
- .lock = __SPIN_LOCK_UNLOCKED(name.lock), \
- }; \
+ static struct srcu_usage name##_srcu_usage = __SRCU_USAGE_INIT(name##_srcu_usage); \
is_static struct srcu_struct name = __SRCU_STRUCT_INIT_MODULE(name, name##_srcu_usage); \
extern struct srcu_struct * const __srcu_struct_##name; \
struct srcu_struct * const __srcu_struct_##name \
@@ -163,9 +166,7 @@ struct srcu_struct {
#else
# define __DEFINE_SRCU(name, is_static) \
static DEFINE_PER_CPU(struct srcu_data, name##_srcu_data); \
- static struct srcu_usage name##_srcu_usage = { \
- .lock = __SPIN_LOCK_UNLOCKED(name.lock), \
- }; \
+ static struct srcu_usage name##_srcu_usage = __SRCU_USAGE_INIT(name##_srcu_usage); \
is_static struct srcu_struct name = \
__SRCU_STRUCT_INIT(name, name##_srcu_usage, name##_srcu_data)
#endif
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index a36066798de7..340eb685cf64 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -135,8 +135,8 @@ static void init_srcu_struct_data(struct srcu_struct *ssp)
spin_lock_init(&ACCESS_PRIVATE(sdp, lock));
rcu_segcblist_init(&sdp->srcu_cblist);
sdp->srcu_cblist_invoking = false;
- sdp->srcu_gp_seq_needed = ssp->srcu_gp_seq;
- sdp->srcu_gp_seq_needed_exp = ssp->srcu_gp_seq;
+ sdp->srcu_gp_seq_needed = ssp->srcu_sup->srcu_gp_seq;
+ sdp->srcu_gp_seq_needed_exp = ssp->srcu_sup->srcu_gp_seq;
sdp->mynode = NULL;
sdp->cpu = cpu;
INIT_WORK(&sdp->work, srcu_invoke_callbacks);
@@ -247,7 +247,7 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
mutex_init(&ssp->srcu_sup->srcu_cb_mutex);
mutex_init(&ssp->srcu_sup->srcu_gp_mutex);
ssp->srcu_idx = 0;
- ssp->srcu_gp_seq = 0;
+ ssp->srcu_sup->srcu_gp_seq = 0;
ssp->srcu_barrier_seq = 0;
mutex_init(&ssp->srcu_barrier_mutex);
atomic_set(&ssp->srcu_barrier_cpu_cnt, 0);
@@ -261,8 +261,8 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
return -ENOMEM;
}
init_srcu_struct_data(ssp);
- ssp->srcu_gp_seq_needed_exp = 0;
- ssp->srcu_last_gp_end = ktime_get_mono_fast_ns();
+ ssp->srcu_sup->srcu_gp_seq_needed_exp = 0;
+ ssp->srcu_sup->srcu_last_gp_end = ktime_get_mono_fast_ns();
if (READ_ONCE(ssp->srcu_sup->srcu_size_state) == SRCU_SIZE_SMALL && SRCU_SIZING_IS_INIT()) {
if (!init_srcu_struct_nodes(ssp, GFP_ATOMIC)) {
if (!ssp->sda_is_static) {
@@ -275,7 +275,7 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
WRITE_ONCE(ssp->srcu_sup->srcu_size_state, SRCU_SIZE_BIG);
}
}
- smp_store_release(&ssp->srcu_gp_seq_needed, 0); /* Init done. */
+ smp_store_release(&ssp->srcu_sup->srcu_gp_seq_needed, 0); /* Init done. */
return 0;
}
@@ -402,10 +402,10 @@ static void check_init_srcu_struct(struct srcu_struct *ssp)
unsigned long flags;
/* The smp_load_acquire() pairs with the smp_store_release(). */
- if (!rcu_seq_state(smp_load_acquire(&ssp->srcu_gp_seq_needed))) /*^^^*/
+ if (!rcu_seq_state(smp_load_acquire(&ssp->srcu_sup->srcu_gp_seq_needed))) /*^^^*/
return; /* Already initialized. */
spin_lock_irqsave_rcu_node(ssp->srcu_sup, flags);
- if (!rcu_seq_state(ssp->srcu_gp_seq_needed)) {
+ if (!rcu_seq_state(ssp->srcu_sup->srcu_gp_seq_needed)) {
spin_unlock_irqrestore_rcu_node(ssp->srcu_sup, flags);
return;
}
@@ -616,11 +616,11 @@ static unsigned long srcu_get_delay(struct srcu_struct *ssp)
unsigned long j;
unsigned long jbase = SRCU_INTERVAL;
- if (ULONG_CMP_LT(READ_ONCE(ssp->srcu_gp_seq), READ_ONCE(ssp->srcu_gp_seq_needed_exp)))
+ if (ULONG_CMP_LT(READ_ONCE(ssp->srcu_sup->srcu_gp_seq), READ_ONCE(ssp->srcu_sup->srcu_gp_seq_needed_exp)))
jbase = 0;
- if (rcu_seq_state(READ_ONCE(ssp->srcu_gp_seq))) {
+ if (rcu_seq_state(READ_ONCE(ssp->srcu_sup->srcu_gp_seq))) {
j = jiffies - 1;
- gpstart = READ_ONCE(ssp->srcu_gp_start);
+ gpstart = READ_ONCE(ssp->srcu_sup->srcu_gp_start);
if (time_after(j, gpstart))
jbase += j - gpstart;
if (!jbase) {
@@ -656,12 +656,12 @@ void cleanup_srcu_struct(struct srcu_struct *ssp)
if (WARN_ON(rcu_segcblist_n_cbs(&sdp->srcu_cblist)))
return; /* Forgot srcu_barrier(), so just leak it! */
}
- if (WARN_ON(rcu_seq_state(READ_ONCE(ssp->srcu_gp_seq)) != SRCU_STATE_IDLE) ||
- WARN_ON(rcu_seq_current(&ssp->srcu_gp_seq) != ssp->srcu_gp_seq_needed) ||
+ if (WARN_ON(rcu_seq_state(READ_ONCE(ssp->srcu_sup->srcu_gp_seq)) != SRCU_STATE_IDLE) ||
+ WARN_ON(rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq) != ssp->srcu_sup->srcu_gp_seq_needed) ||
WARN_ON(srcu_readers_active(ssp))) {
pr_info("%s: Active srcu_struct %p read state: %d gp state: %lu/%lu\n",
- __func__, ssp, rcu_seq_state(READ_ONCE(ssp->srcu_gp_seq)),
- rcu_seq_current(&ssp->srcu_gp_seq), ssp->srcu_gp_seq_needed);
+ __func__, ssp, rcu_seq_state(READ_ONCE(ssp->srcu_sup->srcu_gp_seq)),
+ rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq), ssp->srcu_sup->srcu_gp_seq_needed);
return; /* Caller forgot to stop doing call_srcu()? */
}
kfree(ssp->srcu_sup->node);
@@ -775,18 +775,18 @@ static void srcu_gp_start(struct srcu_struct *ssp)
else
sdp = this_cpu_ptr(ssp->sda);
lockdep_assert_held(&ACCESS_PRIVATE(ssp->srcu_sup, lock));
- WARN_ON_ONCE(ULONG_CMP_GE(ssp->srcu_gp_seq, ssp->srcu_gp_seq_needed));
+ WARN_ON_ONCE(ULONG_CMP_GE(ssp->srcu_sup->srcu_gp_seq, ssp->srcu_sup->srcu_gp_seq_needed));
spin_lock_rcu_node(sdp); /* Interrupts already disabled. */
rcu_segcblist_advance(&sdp->srcu_cblist,
- rcu_seq_current(&ssp->srcu_gp_seq));
+ rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq));
(void)rcu_segcblist_accelerate(&sdp->srcu_cblist,
- rcu_seq_snap(&ssp->srcu_gp_seq));
+ rcu_seq_snap(&ssp->srcu_sup->srcu_gp_seq));
spin_unlock_rcu_node(sdp); /* Interrupts remain disabled. */
- WRITE_ONCE(ssp->srcu_gp_start, jiffies);
+ WRITE_ONCE(ssp->srcu_sup->srcu_gp_start, jiffies);
WRITE_ONCE(ssp->srcu_n_exp_nodelay, 0);
smp_mb(); /* Order prior store to ->srcu_gp_seq_needed vs. GP start. */
- rcu_seq_start(&ssp->srcu_gp_seq);
- state = rcu_seq_state(ssp->srcu_gp_seq);
+ rcu_seq_start(&ssp->srcu_sup->srcu_gp_seq);
+ state = rcu_seq_state(ssp->srcu_sup->srcu_gp_seq);
WARN_ON_ONCE(state != SRCU_STATE_SCAN1);
}
@@ -865,16 +865,16 @@ static void srcu_gp_end(struct srcu_struct *ssp)
/* End the current grace period. */
spin_lock_irq_rcu_node(ssp->srcu_sup);
- idx = rcu_seq_state(ssp->srcu_gp_seq);
+ idx = rcu_seq_state(ssp->srcu_sup->srcu_gp_seq);
WARN_ON_ONCE(idx != SRCU_STATE_SCAN2);
- if (ULONG_CMP_LT(READ_ONCE(ssp->srcu_gp_seq), READ_ONCE(ssp->srcu_gp_seq_needed_exp)))
+ if (ULONG_CMP_LT(READ_ONCE(ssp->srcu_sup->srcu_gp_seq), READ_ONCE(ssp->srcu_sup->srcu_gp_seq_needed_exp)))
cbdelay = 0;
- WRITE_ONCE(ssp->srcu_last_gp_end, ktime_get_mono_fast_ns());
- rcu_seq_end(&ssp->srcu_gp_seq);
- gpseq = rcu_seq_current(&ssp->srcu_gp_seq);
- if (ULONG_CMP_LT(ssp->srcu_gp_seq_needed_exp, gpseq))
- WRITE_ONCE(ssp->srcu_gp_seq_needed_exp, gpseq);
+ WRITE_ONCE(ssp->srcu_sup->srcu_last_gp_end, ktime_get_mono_fast_ns());
+ rcu_seq_end(&ssp->srcu_sup->srcu_gp_seq);
+ gpseq = rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq);
+ if (ULONG_CMP_LT(ssp->srcu_sup->srcu_gp_seq_needed_exp, gpseq))
+ WRITE_ONCE(ssp->srcu_sup->srcu_gp_seq_needed_exp, gpseq);
spin_unlock_irq_rcu_node(ssp->srcu_sup);
mutex_unlock(&ssp->srcu_sup->srcu_gp_mutex);
/* A new grace period can start at this point. But only one. */
@@ -925,9 +925,9 @@ static void srcu_gp_end(struct srcu_struct *ssp)
/* Start a new grace period if needed. */
spin_lock_irq_rcu_node(ssp->srcu_sup);
- gpseq = rcu_seq_current(&ssp->srcu_gp_seq);
+ gpseq = rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq);
if (!rcu_seq_state(gpseq) &&
- ULONG_CMP_LT(gpseq, ssp->srcu_gp_seq_needed)) {
+ ULONG_CMP_LT(gpseq, ssp->srcu_sup->srcu_gp_seq_needed)) {
srcu_gp_start(ssp);
spin_unlock_irq_rcu_node(ssp->srcu_sup);
srcu_reschedule(ssp, 0);
@@ -960,7 +960,7 @@ static void srcu_funnel_exp_start(struct srcu_struct *ssp, struct srcu_node *snp
if (snp)
for (; snp != NULL; snp = snp->srcu_parent) {
sgsne = READ_ONCE(snp->srcu_gp_seq_needed_exp);
- if (WARN_ON_ONCE(rcu_seq_done(&ssp->srcu_gp_seq, s)) ||
+ if (WARN_ON_ONCE(rcu_seq_done(&ssp->srcu_sup->srcu_gp_seq, s)) ||
(!srcu_invl_snp_seq(sgsne) && ULONG_CMP_GE(sgsne, s)))
return;
spin_lock_irqsave_rcu_node(snp, flags);
@@ -973,8 +973,8 @@ static void srcu_funnel_exp_start(struct srcu_struct *ssp, struct srcu_node *snp
spin_unlock_irqrestore_rcu_node(snp, flags);
}
spin_lock_irqsave_ssp_contention(ssp, &flags);
- if (ULONG_CMP_LT(ssp->srcu_gp_seq_needed_exp, s))
- WRITE_ONCE(ssp->srcu_gp_seq_needed_exp, s);
+ if (ULONG_CMP_LT(ssp->srcu_sup->srcu_gp_seq_needed_exp, s))
+ WRITE_ONCE(ssp->srcu_sup->srcu_gp_seq_needed_exp, s);
spin_unlock_irqrestore_rcu_node(ssp->srcu_sup, flags);
}
@@ -1010,7 +1010,7 @@ static void srcu_funnel_gp_start(struct srcu_struct *ssp, struct srcu_data *sdp,
if (snp_leaf)
/* Each pass through the loop does one level of the srcu_node tree. */
for (snp = snp_leaf; snp != NULL; snp = snp->srcu_parent) {
- if (WARN_ON_ONCE(rcu_seq_done(&ssp->srcu_gp_seq, s)) && snp != snp_leaf)
+ if (WARN_ON_ONCE(rcu_seq_done(&ssp->srcu_sup->srcu_gp_seq, s)) && snp != snp_leaf)
return; /* GP already done and CBs recorded. */
spin_lock_irqsave_rcu_node(snp, flags);
snp_seq = snp->srcu_have_cbs[idx];
@@ -1037,20 +1037,20 @@ static void srcu_funnel_gp_start(struct srcu_struct *ssp, struct srcu_data *sdp,
/* Top of tree, must ensure the grace period will be started. */
spin_lock_irqsave_ssp_contention(ssp, &flags);
- if (ULONG_CMP_LT(ssp->srcu_gp_seq_needed, s)) {
+ if (ULONG_CMP_LT(ssp->srcu_sup->srcu_gp_seq_needed, s)) {
/*
* Record need for grace period s. Pair with load
* acquire setting up for initialization.
*/
- smp_store_release(&ssp->srcu_gp_seq_needed, s); /*^^^*/
+ smp_store_release(&ssp->srcu_sup->srcu_gp_seq_needed, s); /*^^^*/
}
- if (!do_norm && ULONG_CMP_LT(ssp->srcu_gp_seq_needed_exp, s))
- WRITE_ONCE(ssp->srcu_gp_seq_needed_exp, s);
+ if (!do_norm && ULONG_CMP_LT(ssp->srcu_sup->srcu_gp_seq_needed_exp, s))
+ WRITE_ONCE(ssp->srcu_sup->srcu_gp_seq_needed_exp, s);
/* If grace period not already in progress, start it. */
- if (!WARN_ON_ONCE(rcu_seq_done(&ssp->srcu_gp_seq, s)) &&
- rcu_seq_state(ssp->srcu_gp_seq) == SRCU_STATE_IDLE) {
- WARN_ON_ONCE(ULONG_CMP_GE(ssp->srcu_gp_seq, ssp->srcu_gp_seq_needed));
+ if (!WARN_ON_ONCE(rcu_seq_done(&ssp->srcu_sup->srcu_gp_seq, s)) &&
+ rcu_seq_state(ssp->srcu_sup->srcu_gp_seq) == SRCU_STATE_IDLE) {
+ WARN_ON_ONCE(ULONG_CMP_GE(ssp->srcu_sup->srcu_gp_seq, ssp->srcu_sup->srcu_gp_seq_needed));
srcu_gp_start(ssp);
// And how can that list_add() in the "else" clause
@@ -1164,18 +1164,18 @@ static bool srcu_might_be_idle(struct srcu_struct *ssp)
/* First, see if enough time has passed since the last GP. */
t = ktime_get_mono_fast_ns();
- tlast = READ_ONCE(ssp->srcu_last_gp_end);
+ tlast = READ_ONCE(ssp->srcu_sup->srcu_last_gp_end);
if (exp_holdoff == 0 ||
time_in_range_open(t, tlast, tlast + exp_holdoff))
return false; /* Too soon after last GP. */
/* Next, check for probable idleness. */
- curseq = rcu_seq_current(&ssp->srcu_gp_seq);
+ curseq = rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq);
smp_mb(); /* Order ->srcu_gp_seq with ->srcu_gp_seq_needed. */
- if (ULONG_CMP_LT(curseq, READ_ONCE(ssp->srcu_gp_seq_needed)))
+ if (ULONG_CMP_LT(curseq, READ_ONCE(ssp->srcu_sup->srcu_gp_seq_needed)))
return false; /* Grace period in progress, so not idle. */
smp_mb(); /* Order ->srcu_gp_seq with prior access. */
- if (curseq != rcu_seq_current(&ssp->srcu_gp_seq))
+ if (curseq != rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq))
return false; /* GP # changed, so not idle. */
return true; /* With reasonable probability, idle! */
}
@@ -1218,8 +1218,8 @@ static unsigned long srcu_gp_start_if_needed(struct srcu_struct *ssp,
if (rhp)
rcu_segcblist_enqueue(&sdp->srcu_cblist, rhp);
rcu_segcblist_advance(&sdp->srcu_cblist,
- rcu_seq_current(&ssp->srcu_gp_seq));
- s = rcu_seq_snap(&ssp->srcu_gp_seq);
+ rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq));
+ s = rcu_seq_snap(&ssp->srcu_sup->srcu_gp_seq);
(void)rcu_segcblist_accelerate(&sdp->srcu_cblist, s);
if (ULONG_CMP_LT(sdp->srcu_gp_seq_needed, s)) {
sdp->srcu_gp_seq_needed = s;
@@ -1430,7 +1430,7 @@ unsigned long get_state_synchronize_srcu(struct srcu_struct *ssp)
// Any prior manipulation of SRCU-protected data must happen
// before the load from ->srcu_gp_seq.
smp_mb();
- return rcu_seq_snap(&ssp->srcu_gp_seq);
+ return rcu_seq_snap(&ssp->srcu_sup->srcu_gp_seq);
}
EXPORT_SYMBOL_GPL(get_state_synchronize_srcu);
@@ -1477,7 +1477,7 @@ EXPORT_SYMBOL_GPL(start_poll_synchronize_srcu);
*/
bool poll_state_synchronize_srcu(struct srcu_struct *ssp, unsigned long cookie)
{
- if (!rcu_seq_done(&ssp->srcu_gp_seq, cookie))
+ if (!rcu_seq_done(&ssp->srcu_sup->srcu_gp_seq, cookie))
return false;
// Ensure that the end of the SRCU grace period happens before
// any subsequent code that the caller might execute.
@@ -1597,16 +1597,16 @@ static void srcu_advance_state(struct srcu_struct *ssp)
* The load-acquire ensures that we see the accesses performed
* by the prior grace period.
*/
- idx = rcu_seq_state(smp_load_acquire(&ssp->srcu_gp_seq)); /* ^^^ */
+ idx = rcu_seq_state(smp_load_acquire(&ssp->srcu_sup->srcu_gp_seq)); /* ^^^ */
if (idx == SRCU_STATE_IDLE) {
spin_lock_irq_rcu_node(ssp->srcu_sup);
- if (ULONG_CMP_GE(ssp->srcu_gp_seq, ssp->srcu_gp_seq_needed)) {
- WARN_ON_ONCE(rcu_seq_state(ssp->srcu_gp_seq));
+ if (ULONG_CMP_GE(ssp->srcu_sup->srcu_gp_seq, ssp->srcu_sup->srcu_gp_seq_needed)) {
+ WARN_ON_ONCE(rcu_seq_state(ssp->srcu_sup->srcu_gp_seq));
spin_unlock_irq_rcu_node(ssp->srcu_sup);
mutex_unlock(&ssp->srcu_sup->srcu_gp_mutex);
return;
}
- idx = rcu_seq_state(READ_ONCE(ssp->srcu_gp_seq));
+ idx = rcu_seq_state(READ_ONCE(ssp->srcu_sup->srcu_gp_seq));
if (idx == SRCU_STATE_IDLE)
srcu_gp_start(ssp);
spin_unlock_irq_rcu_node(ssp->srcu_sup);
@@ -1616,7 +1616,7 @@ static void srcu_advance_state(struct srcu_struct *ssp)
}
}
- if (rcu_seq_state(READ_ONCE(ssp->srcu_gp_seq)) == SRCU_STATE_SCAN1) {
+ if (rcu_seq_state(READ_ONCE(ssp->srcu_sup->srcu_gp_seq)) == SRCU_STATE_SCAN1) {
idx = 1 ^ (ssp->srcu_idx & 1);
if (!try_check_zero(ssp, idx, 1)) {
mutex_unlock(&ssp->srcu_sup->srcu_gp_mutex);
@@ -1624,12 +1624,12 @@ static void srcu_advance_state(struct srcu_struct *ssp)
}
srcu_flip(ssp);
spin_lock_irq_rcu_node(ssp->srcu_sup);
- rcu_seq_set_state(&ssp->srcu_gp_seq, SRCU_STATE_SCAN2);
+ rcu_seq_set_state(&ssp->srcu_sup->srcu_gp_seq, SRCU_STATE_SCAN2);
ssp->srcu_n_exp_nodelay = 0;
spin_unlock_irq_rcu_node(ssp->srcu_sup);
}
- if (rcu_seq_state(READ_ONCE(ssp->srcu_gp_seq)) == SRCU_STATE_SCAN2) {
+ if (rcu_seq_state(READ_ONCE(ssp->srcu_sup->srcu_gp_seq)) == SRCU_STATE_SCAN2) {
/*
* SRCU read-side critical sections are normally short,
@@ -1666,7 +1666,7 @@ static void srcu_invoke_callbacks(struct work_struct *work)
rcu_cblist_init(&ready_cbs);
spin_lock_irq_rcu_node(sdp);
rcu_segcblist_advance(&sdp->srcu_cblist,
- rcu_seq_current(&ssp->srcu_gp_seq));
+ rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq));
if (sdp->srcu_cblist_invoking ||
!rcu_segcblist_ready_cbs(&sdp->srcu_cblist)) {
spin_unlock_irq_rcu_node(sdp);
@@ -1694,7 +1694,7 @@ static void srcu_invoke_callbacks(struct work_struct *work)
spin_lock_irq_rcu_node(sdp);
rcu_segcblist_add_len(&sdp->srcu_cblist, -len);
(void)rcu_segcblist_accelerate(&sdp->srcu_cblist,
- rcu_seq_snap(&ssp->srcu_gp_seq));
+ rcu_seq_snap(&ssp->srcu_sup->srcu_gp_seq));
sdp->srcu_cblist_invoking = false;
more = rcu_segcblist_ready_cbs(&sdp->srcu_cblist);
spin_unlock_irq_rcu_node(sdp);
@@ -1711,12 +1711,12 @@ static void srcu_reschedule(struct srcu_struct *ssp, unsigned long delay)
bool pushgp = true;
spin_lock_irq_rcu_node(ssp->srcu_sup);
- if (ULONG_CMP_GE(ssp->srcu_gp_seq, ssp->srcu_gp_seq_needed)) {
- if (!WARN_ON_ONCE(rcu_seq_state(ssp->srcu_gp_seq))) {
+ if (ULONG_CMP_GE(ssp->srcu_sup->srcu_gp_seq, ssp->srcu_sup->srcu_gp_seq_needed)) {
+ if (!WARN_ON_ONCE(rcu_seq_state(ssp->srcu_sup->srcu_gp_seq))) {
/* All requests fulfilled, time to go idle. */
pushgp = false;
}
- } else if (!rcu_seq_state(ssp->srcu_gp_seq)) {
+ } else if (!rcu_seq_state(ssp->srcu_sup->srcu_gp_seq)) {
/* Outstanding request and no GP. Start one. */
srcu_gp_start(ssp);
}
@@ -1762,7 +1762,7 @@ void srcutorture_get_gp_data(enum rcutorture_type test_type,
if (test_type != SRCU_FLAVOR)
return;
*flags = 0;
- *gp_seq = rcu_seq_current(&ssp->srcu_gp_seq);
+ *gp_seq = rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq);
}
EXPORT_SYMBOL_GPL(srcutorture_get_gp_data);
@@ -1791,7 +1791,7 @@ void srcu_torture_stats_print(struct srcu_struct *ssp, char *tt, char *tf)
if (ss_state < 0 || ss_state >= ARRAY_SIZE(srcu_size_state_name))
ss_state_idx = ARRAY_SIZE(srcu_size_state_name) - 1;
pr_alert("%s%s Tree SRCU g%ld state %d (%s)",
- tt, tf, rcu_seq_current(&ssp->srcu_gp_seq), ss_state,
+ tt, tf, rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq), ss_state,
srcu_size_state_name[ss_state_idx]);
if (!ssp->sda) {
// Called after cleanup_srcu_struct(), perhaps.
@@ -1905,7 +1905,7 @@ static void srcu_module_going(struct module *mod)
for (i = 0; i < mod->num_srcu_structs; i++) {
ssp = *(sspp++);
- if (!rcu_seq_state(smp_load_acquire(&ssp->srcu_gp_seq_needed)) &&
+ if (!rcu_seq_state(smp_load_acquire(&ssp->srcu_sup->srcu_gp_seq_needed)) &&
!WARN_ON_ONCE(!ssp->sda_is_static))
cleanup_srcu_struct(ssp);
free_percpu(ssp->sda);
--
2.40.0.rc2
On Thu, Mar 30, 2023 at 03:47:07PM -0700, Paul E. McKenney wrote:
> The tasks_rcu_exit_srcu variable is used only by kernels built
> with CONFIG_TASKS_RCU=y, but is defined for all kernesl with
> CONFIG_TASKS_RCU_GENERIC=y. Therefore, in kernels built with
> CONFIG_TASKS_RCU_GENERIC=y but CONFIG_TASKS_RCU=n, this gives
> a "defined but not used" warning.
>
> This commit therefore moves this variable under CONFIG_TASKS_RCU.
>
> Link: https://lore.kernel.org/oe-kbuild-all/[email protected]/
> Reported-by: kernel test robot <[email protected]>
> Signed-off-by: Paul E. McKenney <[email protected]>
> ---
> kernel/rcu/tasks.h | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
> index bfb5e1549f2b..185358c2f08d 100644
> --- a/kernel/rcu/tasks.h
> +++ b/kernel/rcu/tasks.h
> @@ -136,8 +136,10 @@ static struct rcu_tasks rt_name = \
> .kname = #rt_name, \
> }
>
> +#ifdef CONFIG_TASKS_RCU
> /* Track exiting tasks in order to allow them to be waited for. */
> DEFINE_STATIC_SRCU(tasks_rcu_exit_srcu);
> +#endif
Reviewed-by: Frederic Weisbecker <[email protected]>
BTW should exit_tasks_rcu_start/stop() be defined as static inline stubs
whenever !CONFIG_TASKS_RCU ? Currently:
* if CONFIG_TASKS_RCU_GENERIC=n, the stubs are as usual (static inline)
* if CONFIG_TASKS_RCU_GENERIC=y && CONFIG_TASKS_RCU=n, then the stubs are
defined as empty linked function (is the compiler smart enough to remove
the empty call?)
On Fri, Mar 31, 2023 at 01:58:38PM +0200, Frederic Weisbecker wrote:
> On Thu, Mar 30, 2023 at 03:47:07PM -0700, Paul E. McKenney wrote:
> > The tasks_rcu_exit_srcu variable is used only by kernels built
> > with CONFIG_TASKS_RCU=y, but is defined for all kernesl with
> > CONFIG_TASKS_RCU_GENERIC=y. Therefore, in kernels built with
> > CONFIG_TASKS_RCU_GENERIC=y but CONFIG_TASKS_RCU=n, this gives
> > a "defined but not used" warning.
> >
> > This commit therefore moves this variable under CONFIG_TASKS_RCU.
> >
> > Link: https://lore.kernel.org/oe-kbuild-all/[email protected]/
> > Reported-by: kernel test robot <[email protected]>
> > Signed-off-by: Paul E. McKenney <[email protected]>
> > ---
> > kernel/rcu/tasks.h | 2 ++
> > 1 file changed, 2 insertions(+)
> >
> > diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
> > index bfb5e1549f2b..185358c2f08d 100644
> > --- a/kernel/rcu/tasks.h
> > +++ b/kernel/rcu/tasks.h
> > @@ -136,8 +136,10 @@ static struct rcu_tasks rt_name = \
> > .kname = #rt_name, \
> > }
> >
> > +#ifdef CONFIG_TASKS_RCU
> > /* Track exiting tasks in order to allow them to be waited for. */
> > DEFINE_STATIC_SRCU(tasks_rcu_exit_srcu);
> > +#endif
>
> Reviewed-by: Frederic Weisbecker <[email protected]>
Thank you! I will apply this on my next rebase.
> BTW should exit_tasks_rcu_start/stop() be defined as static inline stubs
> whenever !CONFIG_TASKS_RCU ? Currently:
>
> * if CONFIG_TASKS_RCU_GENERIC=n, the stubs are as usual (static inline)
>
> * if CONFIG_TASKS_RCU_GENERIC=y && CONFIG_TASKS_RCU=n, then the stubs are
> defined as empty linked function (is the compiler smart enough to remove
> the empty call?)
Good point! They are currently defined as static inline only if there
are no RCU Tasks flavors built into the kernel, so if there was (say)
RCU Tasks Trace but no RCU Tasks, these functions would needlessly be
actual calls to empty functions.
I am not sure that there are any kernels built like this, and this is not
on a fastpath, but I would welcome a patch that made this more precise.
Testing will require a few kernel builds, though. ;-)
Thanx, Paul
On Thu, Mar 30, 2023 at 03:47:10PM -0700, Paul E. McKenney wrote:
> The current srcu_struct structure is on the order of 200 bytes in size
> (depending on architecture and .config), which is much better than the
> old-style 26K bytes, but still all too inconvenient when one is trying
> to achieve good cache locality on a fastpath involving SRCU readers.
>
> However, only a few fields in srcu_struct are used by SRCU readers.
> The remaining fields could be offloaded to a new srcu_update
> structure, thus shrinking the srcu_struct structure down to a few
> tens of bytes. This commit begins this noble quest, a quest that is
> complicated by open-coded initialization of the srcu_struct within the
> srcu_notifier_head structure. This complication is addressed by updating
> the srcu_notifier_head structure's open coding, given that there does
> not appear to be a straightforward way of abstracting that initialization.
>
> This commit moves only the ->node pointer to srcu_update. Later commits
> will move additional fields.
>
> [ paulmck: Fold in [email protected]'s memory-leak fix. ]
>
> Link: https://lore.kernel.org/all/[email protected]/
> Suggested-by: Christoph Hellwig <[email protected]>
[..]
> @@ -236,8 +236,12 @@ static bool init_srcu_struct_nodes(struct srcu_struct *ssp, gfp_t gfp_flags)
> */
> static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
> {
> + if (!is_static)
> + ssp->srcu_sup = kzalloc(sizeof(*ssp->srcu_sup), GFP_KERNEL);
> + if (!ssp->srcu_sup)
> + return -ENOMEM;
> ssp->srcu_size_state = SRCU_SIZE_SMALL;
> - ssp->node = NULL;
> + ssp->srcu_sup->node = NULL;
> mutex_init(&ssp->srcu_cb_mutex);
> mutex_init(&ssp->srcu_gp_mutex);
> ssp->srcu_idx = 0;
> @@ -249,8 +253,11 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
> ssp->sda_is_static = is_static;
> if (!is_static)
> ssp->sda = alloc_percpu(struct srcu_data);
> - if (!ssp->sda)
> + if (!ssp->sda) {
> + if (!is_static)
> + kfree(ssp->srcu_sup);
> return -ENOMEM;
> + }
> init_srcu_struct_data(ssp);
> ssp->srcu_gp_seq_needed_exp = 0;
> ssp->srcu_last_gp_end = ktime_get_mono_fast_ns();
> @@ -259,6 +266,7 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
[1] Here there is an if (!init_srcu_struct_nodes(...)) that the diff does not show.
> if (!ssp->sda_is_static) {
> free_percpu(ssp->sda);
> ssp->sda = NULL;
> + kfree(ssp->srcu_sup);
> return -ENOMEM;
> }
> } else {
Just a comment about the original code with reference to [1].
Here if allocations in init_srcu_struct_nodes() fail, it will return false
and execute the "if (!ssp->sda_is_is_static)" bit.
So if the allocation in [1] fails, then if sda_is_static is true, we return
-ENOMEM, however if sda_is_static is false, we do the following:
ssp->srcu_sup->srcu_ssp = ssp;
smp_store_release(&ssp->srcu_sup->srcu_gp_seq_needed, 0); /* Init done. */
return 0;
Is that really correct?
In other words, if init_srcu_struct_nodes() returns false, then passing along
the return value of init_srcu_struct_nodes() back to the caller of
init_srcu_struct_fields() depends on whether is_static = true or false. That
seems a bit wrong to me, init_srcu_struct_fields() should always return
-ENOMEM when init_srcu_struct_nodes() fails to allocate memory IMHO, whether
is_static is true or not.
Sorry if I missed something subtle, and if the code is correct to begin with.
Also I feel the return paths could be made better to also fix the above issue
I mentioned. How about the following diff on top of the series, would it
work?
Thanks!
---8<-----------------------
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index a887cfc89894..1975d06986fa 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -255,29 +255,30 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
ssp->srcu_sup->sda_is_static = is_static;
if (!is_static)
ssp->sda = alloc_percpu(struct srcu_data);
- if (!ssp->sda) {
- if (!is_static)
- kfree(ssp->srcu_sup);
- return -ENOMEM;
- }
+ if (!ssp->sda)
+ goto err_free_sup;
init_srcu_struct_data(ssp);
ssp->srcu_sup->srcu_gp_seq_needed_exp = 0;
ssp->srcu_sup->srcu_last_gp_end = ktime_get_mono_fast_ns();
if (READ_ONCE(ssp->srcu_sup->srcu_size_state) == SRCU_SIZE_SMALL && SRCU_SIZING_IS_INIT()) {
- if (!init_srcu_struct_nodes(ssp, GFP_ATOMIC)) {
- if (!ssp->srcu_sup->sda_is_static) {
- free_percpu(ssp->sda);
- ssp->sda = NULL;
- kfree(ssp->srcu_sup);
- return -ENOMEM;
- }
- } else {
+ if (!init_srcu_struct_nodes(ssp, GFP_ATOMIC))
+ goto err_free_sda;
+ else
WRITE_ONCE(ssp->srcu_sup->srcu_size_state, SRCU_SIZE_BIG);
- }
}
ssp->srcu_sup->srcu_ssp = ssp;
smp_store_release(&ssp->srcu_sup->srcu_gp_seq_needed, 0); /* Init done. */
return 0;
+
+err_free_sda:
+ if (!is_static) {
+ free_percpu(ssp->sda);
+ ssp->sda = NULL;
+ }
+err_free_sup:
+ if (!is_static)
+ kfree(ssp->srcu_sup);
+ return -ENOMEM;
}
#ifdef CONFIG_DEBUG_LOCK_ALLOC
On Tue, Apr 04, 2023 at 12:35:08AM +0000, Joel Fernandes wrote:
> On Thu, Mar 30, 2023 at 03:47:10PM -0700, Paul E. McKenney wrote:
> > The current srcu_struct structure is on the order of 200 bytes in size
> > (depending on architecture and .config), which is much better than the
> > old-style 26K bytes, but still all too inconvenient when one is trying
> > to achieve good cache locality on a fastpath involving SRCU readers.
> >
> > However, only a few fields in srcu_struct are used by SRCU readers.
> > The remaining fields could be offloaded to a new srcu_update
> > structure, thus shrinking the srcu_struct structure down to a few
> > tens of bytes. This commit begins this noble quest, a quest that is
> > complicated by open-coded initialization of the srcu_struct within the
> > srcu_notifier_head structure. This complication is addressed by updating
> > the srcu_notifier_head structure's open coding, given that there does
> > not appear to be a straightforward way of abstracting that initialization.
> >
> > This commit moves only the ->node pointer to srcu_update. Later commits
> > will move additional fields.
> >
> > [ paulmck: Fold in [email protected]'s memory-leak fix. ]
> >
> > Link: https://lore.kernel.org/all/[email protected]/
> > Suggested-by: Christoph Hellwig <[email protected]>
>
> [..]
> > @@ -236,8 +236,12 @@ static bool init_srcu_struct_nodes(struct srcu_struct *ssp, gfp_t gfp_flags)
> > */
> > static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
> > {
> > + if (!is_static)
> > + ssp->srcu_sup = kzalloc(sizeof(*ssp->srcu_sup), GFP_KERNEL);
> > + if (!ssp->srcu_sup)
> > + return -ENOMEM;
> > ssp->srcu_size_state = SRCU_SIZE_SMALL;
> > - ssp->node = NULL;
> > + ssp->srcu_sup->node = NULL;
> > mutex_init(&ssp->srcu_cb_mutex);
> > mutex_init(&ssp->srcu_gp_mutex);
> > ssp->srcu_idx = 0;
> > @@ -249,8 +253,11 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
> > ssp->sda_is_static = is_static;
> > if (!is_static)
> > ssp->sda = alloc_percpu(struct srcu_data);
> > - if (!ssp->sda)
> > + if (!ssp->sda) {
> > + if (!is_static)
> > + kfree(ssp->srcu_sup);
> > return -ENOMEM;
> > + }
> > init_srcu_struct_data(ssp);
> > ssp->srcu_gp_seq_needed_exp = 0;
> > ssp->srcu_last_gp_end = ktime_get_mono_fast_ns();
> > @@ -259,6 +266,7 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
>
> [1] Here there is an if (!init_srcu_struct_nodes(...)) that the diff does not show.
>
> > if (!ssp->sda_is_static) {
> > free_percpu(ssp->sda);
> > ssp->sda = NULL;
> > + kfree(ssp->srcu_sup);
> > return -ENOMEM;
> > }
> > } else {
>
>
> Just a comment about the original code with reference to [1].
>
> Here if allocations in init_srcu_struct_nodes() fail, it will return false
> and execute the "if (!ssp->sda_is_is_static)" bit.
>
> So if the allocation in [1] fails, then if sda_is_static is true, we return
> -ENOMEM, however if sda_is_static is false, we do the following:
>
> ssp->srcu_sup->srcu_ssp = ssp;
> smp_store_release(&ssp->srcu_sup->srcu_gp_seq_needed, 0); /* Init done. */
> return 0;
>
> Is that really correct?
Doesn't look like it, now that you mention it. Good catch, thank you!
> In other words, if init_srcu_struct_nodes() returns false, then passing along
> the return value of init_srcu_struct_nodes() back to the caller of
> init_srcu_struct_fields() depends on whether is_static = true or false. That
> seems a bit wrong to me, init_srcu_struct_fields() should always return
> -ENOMEM when init_srcu_struct_nodes() fails to allocate memory IMHO, whether
> is_static is true or not.
>
> Sorry if I missed something subtle, and if the code is correct to begin with.
> Also I feel the return paths could be made better to also fix the above issue
> I mentioned. How about the following diff on top of the series, would it
> work?
Your restructuring looks like an excellent step forward, but given the late
date, it might be best to avoid being in a rush.
I -could- make the following small patch:
if (READ_ONCE(ssp->srcu_sup->srcu_size_state) == SRCU_SIZE_SMALL && SRCU_SIZING_IS_INIT()) {
if (!init_srcu_struct_nodes(ssp, GFP_ATOMIC)) {
if (!ssp->srcu_sup->sda_is_static) {
free_percpu(ssp->sda);
ssp->sda = NULL;
kfree(ssp->srcu_sup);
}
return -ENOMEM;
} else {
WRITE_ONCE(ssp->srcu_sup->srcu_size_state, SRCU_SIZE_BIG);
}
}
Except that this is a pre-existing bug that as far as we know no one
is hitting, so the risk doesn't seem to stack up. After all, if you
are hitting memory exhaustion at boot or on module load, this bug is
probably the least of your problems. Even this fix looks to be v6.5
material to me.
So would you be willing to send a patch like the above that fixes the
bug, and another like you have below to get a better error-path
structure? No hurry, the end of this month is perfectly fine.
And again, good catch!
Thanx, Paul
> Thanks!
>
> ---8<-----------------------
>
> diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
> index a887cfc89894..1975d06986fa 100644
> --- a/kernel/rcu/srcutree.c
> +++ b/kernel/rcu/srcutree.c
> @@ -255,29 +255,30 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
> ssp->srcu_sup->sda_is_static = is_static;
> if (!is_static)
> ssp->sda = alloc_percpu(struct srcu_data);
> - if (!ssp->sda) {
> - if (!is_static)
> - kfree(ssp->srcu_sup);
> - return -ENOMEM;
> - }
> + if (!ssp->sda)
> + goto err_free_sup;
> init_srcu_struct_data(ssp);
> ssp->srcu_sup->srcu_gp_seq_needed_exp = 0;
> ssp->srcu_sup->srcu_last_gp_end = ktime_get_mono_fast_ns();
> if (READ_ONCE(ssp->srcu_sup->srcu_size_state) == SRCU_SIZE_SMALL && SRCU_SIZING_IS_INIT()) {
> - if (!init_srcu_struct_nodes(ssp, GFP_ATOMIC)) {
> - if (!ssp->srcu_sup->sda_is_static) {
> - free_percpu(ssp->sda);
> - ssp->sda = NULL;
> - kfree(ssp->srcu_sup);
> - return -ENOMEM;
> - }
> - } else {
> + if (!init_srcu_struct_nodes(ssp, GFP_ATOMIC))
> + goto err_free_sda;
> + else
> WRITE_ONCE(ssp->srcu_sup->srcu_size_state, SRCU_SIZE_BIG);
> - }
> }
> ssp->srcu_sup->srcu_ssp = ssp;
> smp_store_release(&ssp->srcu_sup->srcu_gp_seq_needed, 0); /* Init done. */
> return 0;
> +
> +err_free_sda:
> + if (!is_static) {
> + free_percpu(ssp->sda);
> + ssp->sda = NULL;
> + }
> +err_free_sup:
> + if (!is_static)
> + kfree(ssp->srcu_sup);
> + return -ENOMEM;
> }
>
> #ifdef CONFIG_DEBUG_LOCK_ALLOC
On Mon, Apr 3, 2023 at 9:06 PM Paul E. McKenney <[email protected]> wrote:
[..]
> > In other words, if init_srcu_struct_nodes() returns false, then passing along
> > the return value of init_srcu_struct_nodes() back to the caller of
> > init_srcu_struct_fields() depends on whether is_static = true or false. That
> > seems a bit wrong to me, init_srcu_struct_fields() should always return
> > -ENOMEM when init_srcu_struct_nodes() fails to allocate memory IMHO, whether
> > is_static is true or not.
> >
> > Sorry if I missed something subtle, and if the code is correct to begin with.
> > Also I feel the return paths could be made better to also fix the above issue
> > I mentioned. How about the following diff on top of the series, would it
> > work?
>
> Your restructuring looks like an excellent step forward, but given the late
> date, it might be best to avoid being in a rush.
>
> I -could- make the following small patch:
>
> if (READ_ONCE(ssp->srcu_sup->srcu_size_state) == SRCU_SIZE_SMALL && SRCU_SIZING_IS_INIT()) {
> if (!init_srcu_struct_nodes(ssp, GFP_ATOMIC)) {
> if (!ssp->srcu_sup->sda_is_static) {
> free_percpu(ssp->sda);
> ssp->sda = NULL;
> kfree(ssp->srcu_sup);
> }
> return -ENOMEM;
> } else {
> WRITE_ONCE(ssp->srcu_sup->srcu_size_state, SRCU_SIZE_BIG);
> }
> }
>
> Except that this is a pre-existing bug that as far as we know no one
> is hitting, so the risk doesn't seem to stack up. After all, if you
> are hitting memory exhaustion at boot or on module load, this bug is
> probably the least of your problems. Even this fix looks to be v6.5
> material to me.
>
> So would you be willing to send a patch like the above that fixes the
> bug, and another like you have below to get a better error-path
> structure? No hurry, the end of this month is perfectly fine.
Yes, I am happy to send patches along these lines by the end of the
month. I will make a note to do so.
> And again, good catch!
Thanks!
- Joel
On Thu, Mar 30, 2023 at 03:47:02PM -0700, Paul E. McKenney wrote:
> Hello!
>
> This post-RFC series shrinks the srcu_struct structure to the bare minimum
> required to support SRCU readers, relegating the remaining fields to a new
> srcu_usage structure. Statically allocated srcu_struct structures created
> by DEFINE_SRCU() and DEFINE_STATIC_SRCU() have statically allocated
> srcu_usage structures, but those required for dynamically allocated
> srcu_struct structures that are initialized using init_srcu_struct()
> are dynamically allocated.
>
> The results is a reduction in the size of an srcu_struct structure from
> a couple hundred bytes to just 24 bytes on x86_64 systems. This can be
> helpful when SRCU readers are used in a fastpath for which the srcu_struct
> structure must be embedded in another structure, and especially where
> that fastpath also needs to access fields both before and after the
> srcu_struct structure.
>
> This series takes baby steps, in part because breaking SRCU means that
> you get absolutely no console output. Yes, I did learn this the hard way.
> Why do you ask? ;-)
>
> Here are those baby steps:
>
> 1. rcu-tasks: Fix warning for unused tasks_rcu_exit_srcu.
>
> 2. Add whitespace to __SRCU_STRUCT_INIT() & __DEFINE_SRCU().
>
> 3. Use static init for statically allocated in-module srcu_struct.
>
> 4. Begin offloading srcu_struct fields to srcu_update.
>
> 5. Move ->level from srcu_struct to srcu_usage.
>
> 6. Move ->srcu_size_state from srcu_struct to srcu_usage.
>
> 7. Move ->srcu_cb_mutex from srcu_struct to srcu_usage.
>
> 8. Move ->lock initialization after srcu_usage allocation.
>
> 9. Move ->lock from srcu_struct to srcu_usage.
>
> 10. Move ->srcu_gp_mutex from srcu_struct to srcu_usage.
>
> 11. Move grace-period fields from srcu_struct to srcu_usage.
>
> 12. Move heuristics fields from srcu_struct to srcu_usage.
>
> 13. Move ->sda_is_static from srcu_struct to srcu_usage.
>
> 14. Move srcu_barrier() fields from srcu_struct to srcu_usage.
>
> 15. Move work-scheduling fields from srcu_struct to srcu_usage.
>
> 16. Check for readers at module-exit time.
>
> 17. Fix long lines in srcu_get_delay().
>
> 18. Fix long lines in cleanup_srcu_struct().
>
> 19. Fix long lines in srcu_gp_end().
>
> 20. Fix long lines in srcu_funnel_gp_start().
>
> Changes since the RFC series:
> https://lore.kernel.org/all/3db82572-f156-4a5d-b711-841aa28bd996@paulmck-laptop/
>
> 1. Add checks for readers of in-module statically allocated
> srcu_struct structures persisting past module unload.
>
> 2. Apply Tested-by tags.
>
> 3. Apply feedback from "Zhang, Qiang1" and kernel test robot,
> perhaps most notably getting rid of memory leaks and improving
> the handling of statically allocated srcu_struct structures
> defined within modules.
>
> 4. Drop the commit removing extraneous parentheses given the desire
> to push this into the v6.4 merge window, the fact that this
> commit generates conflicts with other v6.4 RCU commits, and the
> low value of this commit. It therefore remains in the v6.5 pile.
>
> Thanx, Paul
>
> ------------------------------------------------------------------------
>
> b/include/linux/notifier.h | 5
> b/include/linux/srcutiny.h | 6
> b/include/linux/srcutree.h | 28 +-
> b/kernel/rcu/rcu.h | 6
> b/kernel/rcu/srcutree.c | 19 +
> b/kernel/rcu/tasks.h | 2
> include/linux/srcutree.h | 123 ++++++-----
> kernel/rcu/srcutree.c | 495 +++++++++++++++++++++++----------------------
> 8 files changed, 370 insertions(+), 314 deletions(-)
It looks good on my ARM64 board:
Tested-by: Joel Fernandes (Google) <[email protected]>
Output of run:
tools/testing/selftests/rcutorture/bin/kvm.sh --allcpus --duration 2h --configs SRCU-P SRCU-N SRCU-T SRCU-U --trust-make
SRCU-N ------- 204135 GPs (28.3521/s) [srcu: g1765408 f0x0 total-gps=1765408] n_max_cbs: 150000
:CONFIG_HYPERVISOR_GUEST=y: improperly set
:CONFIG_KVM_GUEST=y: improperly set
SRCU-P ------- 105511 GPs (14.6543/s) [srcud: g884128 f0x0 total-gps=884128] n_max_cbs: 150000
:CONFIG_HYPERVISOR_GUEST=y: improperly set
:CONFIG_KVM_GUEST=y: improperly set
SRCU-T ------- 334055 GPs (46.3965/s) [srcu: g2638072 f0x0 total-gps=2638072] n_max_cbs: 50000
:CONFIG_HYPERVISOR_GUEST=y: improperly set
:CONFIG_KVM_GUEST=y: improperly set
:CONFIG_SMP: improperly set
:CONFIG_TINY_SRCU=y: improperly set
SRCU-U ------- 292738 GPs (40.6581/s) [srcud: g2349416 f0x0 total-gps=2349416] n_max_cbs: 50000
:CONFIG_HYPERVISOR_GUEST=y: improperly set
:CONFIG_KVM_GUEST=y: improperly set
:CONFIG_SMP: improperly set
:CONFIG_TINY_SRCU=y: improperly set
1 runs with build errors.
That "build error" is actually perl doing this:
perl: warning: Please check that your locale settings:
perl: warning: Falling back to the standard locale ("C").
I think its harmless and the test did fine. It is just that my chroot is
missing some packages (I have run into this warning before).
The "improperly set" thingies are perhaps Kconfig on ARM64 setting some KVM
options slightly differently. I have seen that before as well on this board.
thanks,
- Joel
On Tue, Apr 04, 2023 at 01:57:41PM +0000, Joel Fernandes wrote:
> On Thu, Mar 30, 2023 at 03:47:02PM -0700, Paul E. McKenney wrote:
> > Hello!
> >
> > This post-RFC series shrinks the srcu_struct structure to the bare minimum
> > required to support SRCU readers, relegating the remaining fields to a new
> > srcu_usage structure. Statically allocated srcu_struct structures created
> > by DEFINE_SRCU() and DEFINE_STATIC_SRCU() have statically allocated
> > srcu_usage structures, but those required for dynamically allocated
> > srcu_struct structures that are initialized using init_srcu_struct()
> > are dynamically allocated.
> >
> > The results is a reduction in the size of an srcu_struct structure from
> > a couple hundred bytes to just 24 bytes on x86_64 systems. This can be
> > helpful when SRCU readers are used in a fastpath for which the srcu_struct
> > structure must be embedded in another structure, and especially where
> > that fastpath also needs to access fields both before and after the
> > srcu_struct structure.
> >
> > This series takes baby steps, in part because breaking SRCU means that
> > you get absolutely no console output. Yes, I did learn this the hard way.
> > Why do you ask? ;-)
> >
> > Here are those baby steps:
> >
> > 1. rcu-tasks: Fix warning for unused tasks_rcu_exit_srcu.
> >
> > 2. Add whitespace to __SRCU_STRUCT_INIT() & __DEFINE_SRCU().
> >
> > 3. Use static init for statically allocated in-module srcu_struct.
> >
> > 4. Begin offloading srcu_struct fields to srcu_update.
> >
> > 5. Move ->level from srcu_struct to srcu_usage.
> >
> > 6. Move ->srcu_size_state from srcu_struct to srcu_usage.
> >
> > 7. Move ->srcu_cb_mutex from srcu_struct to srcu_usage.
> >
> > 8. Move ->lock initialization after srcu_usage allocation.
> >
> > 9. Move ->lock from srcu_struct to srcu_usage.
> >
> > 10. Move ->srcu_gp_mutex from srcu_struct to srcu_usage.
> >
> > 11. Move grace-period fields from srcu_struct to srcu_usage.
> >
> > 12. Move heuristics fields from srcu_struct to srcu_usage.
> >
> > 13. Move ->sda_is_static from srcu_struct to srcu_usage.
> >
> > 14. Move srcu_barrier() fields from srcu_struct to srcu_usage.
> >
> > 15. Move work-scheduling fields from srcu_struct to srcu_usage.
> >
> > 16. Check for readers at module-exit time.
> >
> > 17. Fix long lines in srcu_get_delay().
> >
> > 18. Fix long lines in cleanup_srcu_struct().
> >
> > 19. Fix long lines in srcu_gp_end().
> >
> > 20. Fix long lines in srcu_funnel_gp_start().
> >
> > Changes since the RFC series:
> > https://lore.kernel.org/all/3db82572-f156-4a5d-b711-841aa28bd996@paulmck-laptop/
> >
> > 1. Add checks for readers of in-module statically allocated
> > srcu_struct structures persisting past module unload.
> >
> > 2. Apply Tested-by tags.
> >
> > 3. Apply feedback from "Zhang, Qiang1" and kernel test robot,
> > perhaps most notably getting rid of memory leaks and improving
> > the handling of statically allocated srcu_struct structures
> > defined within modules.
> >
> > 4. Drop the commit removing extraneous parentheses given the desire
> > to push this into the v6.4 merge window, the fact that this
> > commit generates conflicts with other v6.4 RCU commits, and the
> > low value of this commit. It therefore remains in the v6.5 pile.
> >
> > Thanx, Paul
> >
> > ------------------------------------------------------------------------
> >
> > b/include/linux/notifier.h | 5
> > b/include/linux/srcutiny.h | 6
> > b/include/linux/srcutree.h | 28 +-
> > b/kernel/rcu/rcu.h | 6
> > b/kernel/rcu/srcutree.c | 19 +
> > b/kernel/rcu/tasks.h | 2
> > include/linux/srcutree.h | 123 ++++++-----
> > kernel/rcu/srcutree.c | 495 +++++++++++++++++++++++----------------------
> > 8 files changed, 370 insertions(+), 314 deletions(-)
>
> It looks good on my ARM64 board:
> Tested-by: Joel Fernandes (Google) <[email protected]>
Thank you! I will apply on my rebase later today.
> Output of run:
>
> tools/testing/selftests/rcutorture/bin/kvm.sh --allcpus --duration 2h --configs SRCU-P SRCU-N SRCU-T SRCU-U --trust-make
> SRCU-N ------- 204135 GPs (28.3521/s) [srcu: g1765408 f0x0 total-gps=1765408] n_max_cbs: 150000
> :CONFIG_HYPERVISOR_GUEST=y: improperly set
> :CONFIG_KVM_GUEST=y: improperly set
> SRCU-P ------- 105511 GPs (14.6543/s) [srcud: g884128 f0x0 total-gps=884128] n_max_cbs: 150000
> :CONFIG_HYPERVISOR_GUEST=y: improperly set
> :CONFIG_KVM_GUEST=y: improperly set
> SRCU-T ------- 334055 GPs (46.3965/s) [srcu: g2638072 f0x0 total-gps=2638072] n_max_cbs: 50000
> :CONFIG_HYPERVISOR_GUEST=y: improperly set
> :CONFIG_KVM_GUEST=y: improperly set
> :CONFIG_SMP: improperly set
> :CONFIG_TINY_SRCU=y: improperly set
> SRCU-U ------- 292738 GPs (40.6581/s) [srcud: g2349416 f0x0 total-gps=2349416] n_max_cbs: 50000
> :CONFIG_HYPERVISOR_GUEST=y: improperly set
> :CONFIG_KVM_GUEST=y: improperly set
> :CONFIG_SMP: improperly set
> :CONFIG_TINY_SRCU=y: improperly set
> 1 runs with build errors.
>
> That "build error" is actually perl doing this:
>
> perl: warning: Please check that your locale settings:
> perl: warning: Falling back to the standard locale ("C").
>
> I think its harmless and the test did fine. It is just that my chroot is
> missing some packages (I have run into this warning before).
Frederic ran into some similar messages where the scripts implicitly
assumed en_us locale, but his warnings were specific to rcutorture.
> The "improperly set" thingies are perhaps Kconfig on ARM64 setting some KVM
> options slightly differently. I have seen that before as well on this board.
Agreed, and one of the things on the long list is to allow arch-specific
settings for those options.
The CONFIG_SMP warnings are interesting, though. Does arm64 disallow
!SMP builds? ;-)
Thanx, Paul
[..]
> > The "improperly set" thingies are perhaps Kconfig on ARM64 setting some KVM
> > options slightly differently. I have seen that before as well on this board.
>
> Agreed, and one of the things on the long list is to allow arch-specific
> settings for those options.
>
> The CONFIG_SMP warnings are interesting, though. Does arm64 disallow
> !SMP builds? ;-)
Yes, it looks like it. As per arch/arm64/Kconfig, CONFIG_SMP is always
'y' and cannot be set.
config SMP
def_bool y
Thanks.
On Tue, Apr 04, 2023 at 01:01:50PM -0400, Joel Fernandes wrote:
> [..]
> > > The "improperly set" thingies are perhaps Kconfig on ARM64 setting some KVM
> > > options slightly differently. I have seen that before as well on this board.
> >
> > Agreed, and one of the things on the long list is to allow arch-specific
> > settings for those options.
> >
> > The CONFIG_SMP warnings are interesting, though. Does arm64 disallow
> > !SMP builds? ;-)
>
> Yes, it looks like it. As per arch/arm64/Kconfig, CONFIG_SMP is always
> 'y' and cannot be set.
>
> config SMP
> def_bool y
That would do it! And of course this also explains the CONFIG_TINY_SRCU
complaints.
Thanx, Paul
Hi Paul,
On 30/03/2023 23:47, Paul E. McKenney wrote:
> This commit moves the ->srcu_gp_seq, ->srcu_gp_seq_needed,
> ->srcu_gp_seq_needed_exp, ->srcu_gp_start, and ->srcu_last_gp_end fields
> from the srcu_struct structure to the srcu_usage structure to reduce
> the size of the former in order to improve cache locality.
>
> Suggested-by: Christoph Hellwig <[email protected]>
> Tested-by: Sachin Sant <[email protected]>
> Tested-by: "Zhang, Qiang1" <[email protected]>
> Signed-off-by: Paul E. McKenney <[email protected]>
I have noticed a suspend regression on some of our Tegra boards recently with v6.4-rc and interestingly bisect is pointing to this commit. I was unable revert this on top of the latest mainline but if I checkout this commit suspend fails and if I checkout the previous commit is passes.
Enabling more debug I was able to capture the following crash log I see on one of the boards ...
[ 57.327645] PM: suspend entry (deep)
[ 57.331660] Filesystems sync: 0.000 seconds
[ 57.340147] Freezing user space processes
[ 57.347470] Freezing user space processes completed (elapsed 0.007 seconds)
[ 57.347501] OOM killer disabled.
[ 57.347508] Freezing remaining freezable tasks
[ 57.348834] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
[ 57.349932] 8<--- cut here ---
[ 57.349943] Unable to handle kernel NULL pointer dereference at virtual address 00000000 when write
[ 57.349960] [00000000] *pgd=00000000
[ 57.349986] Internal error: Oops: 805 [#1] PREEMPT SMP ARM
[ 57.350007] Modules linked in: tegra30_tsensor
[ 57.350033] CPU: 0 PID: 589 Comm: rtcwake Not tainted 6.3.0-rc1-00011-g03200b5ca3b4-dirty #3
[ 57.350057] Hardware name: NVIDIA Tegra SoC (Flattened Device Tree)
[ 57.350067] PC is at rcu_segcblist_enqueue+0x2c/0x38
[ 57.350120] LR is at srcu_gp_start_if_needed+0xe4/0x544
[ 57.350169] pc : [<c01a5120>] lr : [<c0198b5c>] psr: a0070093
[ 57.350183] sp : f0b2dd20 ip : 3b5870ef fp : 00000000
[ 57.350194] r10: ef787d84 r9 : 00000000 r8 : ef787d80
[ 57.350205] r7 : 80070013 r6 : c131ec30 r5 : ef787d40 r4 : f0b2dd64
[ 57.350217] r3 : 00000000 r2 : 00000000 r1 : f0b2dd64 r0 : ef787d84
[ 57.350230] Flags: NzCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment none
[ 57.350251] Control: 10c5387d Table: 81d8004a DAC: 00000051
[ 57.350261] Register r0 information: non-slab/vmalloc memory
[ 57.350283] Register r1 information: 2-page vmalloc region starting at 0xf0b2c000 allocated at kernel_clone+0xb4/0x3e4
[ 57.350322] Register r2 information: NULL pointer
[ 57.350337] Register r3 information: NULL pointer
[ 57.350350] Register r4 information: 2-page vmalloc region starting at 0xf0b2c000 allocated at kernel_clone+0xb4/0x3e4
[ 57.350379] Register r5 information: non-slab/vmalloc memory
[ 57.350394] Register r6 information: non-slab/vmalloc memory
[ 57.350408] Register r7 information: non-paged memory
[ 57.350422] Register r8 information: non-slab/vmalloc memory
[ 57.350436] Register r9 information: NULL pointer
[ 57.350449] Register r10 information: non-slab/vmalloc memory
[ 57.350463] Register r11 information: NULL pointer
[ 57.350477] Register r12 information: non-paged memory
[ 57.350491] Process rtcwake (pid: 589, stack limit = 0x410bb531)
[ 57.350510] Stack: (0xf0b2dd20 to 0xf0b2e000)
[ 57.350534] dd20: 00000000 c1ee4a40 f0b2dd7c c0184f24 ef781495 3b5870ef c1ee4a40 c1ee4a40
[ 57.350555] dd40: c131ec30 00000000 00000002 c0f3d1fc c3542ac0 c2abbb10 c1ee4a40 c0199044
[ 57.350574] dd60: 60070013 00000000 c0195924 00000000 00000000 f0b2dd74 f0b2dd74 3b5870ef
[ 57.350592] dd80: 00000000 c131ebc0 c120ab28 c0146d9c c2785b94 c2785b40 c0fee9f4 c0872590
[ 57.350611] dda0: c2785b40 c08c39cc c2785b40 c08c3a3c c2788c00 c08c40b0 c0f3d1fc c066f028
[ 57.350630] ddc0: f0b2de14 c2788c00 c1325ef4 c08c1d40 c2788c00 c1325ef4 c0fee9f4 c08c31cc
[ 57.350648] dde0: c13708a0 0000000d 00000000 c0681c10 c16afe84 00000002 56508788 0000000d
[ 57.350665] de00: 00000002 c13708a0 10624dd3 56409580 0000000d 00000000 00000002 c0f3d1fc
[ 57.350685] de20: c3542ac0 c2abbb10 c1ee4a40 c06824e4 00000000 ffffa900 00000000 c1386510
[ 57.350703] de40: 00000003 00000003 c1204f75 c017e8a8 c3542ac0 c2abbb10 00428228 c0171574
[ 57.350721] de60: 00000000 00000000 00000003 3b5870ef c1204f75 00000000 00000003 c137aeb4
[ 57.350739] de80: c1204f75 c0f3d1fc c3542ac0 c2abbb10 00428228 c017f380 00000003 c0f38a54
[ 57.350757] dea0: 00000003 c1386524 00000004 c017d708 00000004 c2abbb00 00000000 00000000
[ 57.350775] dec0: c3542ac0 f0b2df28 c2abbb10 c03305b4 00000000 00000000 c2953c00 c1ee4a40
[ 57.350794] dee0: 00429438 00000004 c0d18488 00004004 00000000 c02b1094 00000a55 c1d80010
[ 57.350812] df00: c1d80010 00000000 00000000 f0b2df78 01010006 00000004 00000000 00429438
[ 57.350830] df20: 00000000 00000000 c2953c00 00000000 00000000 00000000 00000000 00000000
[ 57.350848] df40: 00000000 00004004 00000000 00000000 0000006c 3b5870ef c2953c00 c2953c00
[ 57.350866] df60: 00000000 00000000 c1ee4a40 00429438 00000004 c02b12c8 00000000 00000000
[ 57.350885] df80: 00001008 3b5870ef 0000006c 00429438 00428228 00000004 c0100324 c1ee4a40
[ 57.350902] dfa0: 00000004 c01000c0 0000006c 00429438 00000004 00429438 00000004 00000000
[ 57.350920] dfc0: 0000006c 00429438 00428228 00000004 00000004 00000004 0041578c 00428228
[ 57.350938] dfe0: 00000004 becda9a8 b6e9bc0b b6e26206 600f0030 00000004 00000000 00000000
[ 57.350960] rcu_segcblist_enqueue from srcu_gp_start_if_needed+0xe4/0x544
[ 57.351023] srcu_gp_start_if_needed from __synchronize_srcu.part.6+0x70/0x98
[ 57.351084] __synchronize_srcu.part.6 from srcu_notifier_chain_unregister+0x6c/0xdc
[ 57.351155] srcu_notifier_chain_unregister from cpufreq_unregister_notifier+0x60/0xbc
[ 57.351215] cpufreq_unregister_notifier from tegra_actmon_pause.part.0+0x1c/0x54
[ 57.351277] tegra_actmon_pause.part.0 from tegra_actmon_stop+0x38/0x3c
[ 57.351324] tegra_actmon_stop from tegra_governor_event_handler+0x100/0x11c
[ 57.351373] tegra_governor_event_handler from devfreq_suspend_device+0x64/0xac
[ 57.351423] devfreq_suspend_device from devfreq_suspend+0x30/0x64
[ 57.351467] devfreq_suspend from dpm_suspend+0x34/0x33c
[ 57.351506] dpm_suspend from dpm_suspend_start+0x90/0x98
[ 57.351528] dpm_suspend_start from suspend_devices_and_enter+0xe4/0x93c
[ 57.351573] suspend_devices_and_enter from pm_suspend+0x280/0x3ac
[ 57.351614] pm_suspend from state_store+0x6c/0xc8
[ 57.351654] state_store from kernfs_fop_write_iter+0x118/0x1b4
[ 57.351696] kernfs_fop_write_iter from vfs_write+0x314/0x3d4
[ 57.351733] vfs_write from ksys_write+0xa0/0xd0
[ 57.351760] ksys_write from ret_fast_syscall+0x0/0x54
[ 57.351788] Exception stack(0xf0b2dfa8 to 0xf0b2dff0)
[ 57.351809] dfa0: 0000006c 00429438 00000004 00429438 00000004 00000000
[ 57.351828] dfc0: 0000006c 00429438 00428228 00000004 00000004 00000004 0041578c 00428228
[ 57.351843] dfe0: 00000004 becda9a8 b6e9bc0b b6e26206
[ 57.351863] Code: e2833001 e5803034 e5812000 e5903010 (e5831000)
[ 57.351875] ---[ end trace 0000000000000000 ]---
I have not dug into this yet and so wanted to see if you have any thoughts on this?
Thanks!
Jon
--
nvpublic
On Thu, Jun 01, 2023 at 08:33:10PM +0800, Z qiang wrote:
> >
> > Hi Paul,
> >
> > On 30/03/2023 23:47, Paul E. McKenney wrote:
> > > This commit moves the ->srcu_gp_seq, ->srcu_gp_seq_needed,
> > > ->srcu_gp_seq_needed_exp, ->srcu_gp_start, and ->srcu_last_gp_end fields
> > > from the srcu_struct structure to the srcu_usage structure to reduce
> > > the size of the former in order to improve cache locality.
> > >
> > > Suggested-by: Christoph Hellwig <[email protected]>
> > > Tested-by: Sachin Sant <[email protected]>
> > > Tested-by: "Zhang, Qiang1" <[email protected]>
> > > Signed-off-by: Paul E. McKenney <[email protected]>
> >
> >
> > I have noticed a suspend regression on some of our Tegra boards recently
> with v6.4-rc and interestingly bisect is pointing to this commit. I was
> unable revert this on top of the latest mainline but if I checkout this
> commit suspend fails and if I checkout the previous commit is passes.
> >
> > Enabling more debug I was able to capture the following crash log I see
> on one of the boards ...
> >
> > [ 57.327645] PM: suspend entry (deep)
> > [ 57.331660] Filesystems sync: 0.000 seconds
> > [ 57.340147] Freezing user space processes
> > [ 57.347470] Freezing user space processes completed (elapsed 0.007
> seconds)
> > [ 57.347501] OOM killer disabled.
> > [ 57.347508] Freezing remaining freezable tasks
> > [ 57.348834] Freezing remaining freezable tasks completed (elapsed
> 0.001 seconds)
> > [ 57.349932] 8<--- cut here ---
> > [ 57.349943] Unable to handle kernel NULL pointer dereference at
> virtual address 00000000 when write
> > [ 57.349960] [00000000] *pgd=00000000
> > [ 57.349986] Internal error: Oops: 805 [#1] PREEMPT SMP ARM
> > [ 57.350007] Modules linked in: tegra30_tsensor
> > [ 57.350033] CPU: 0 PID: 589 Comm: rtcwake Not tainted
> 6.3.0-rc1-00011-g03200b5ca3b4-dirty #3
> > [ 57.350057] Hardware name: NVIDIA Tegra SoC (Flattened Device Tree)
> > [ 57.350067] PC is at rcu_segcblist_enqueue+0x2c/0x38
> > [ 57.350120] LR is at srcu_gp_start_if_needed+0xe4/0x544
> > [ 57.350169] pc : [<c01a5120>] lr : [<c0198b5c>] psr: a0070093
> > [ 57.350183] sp : f0b2dd20 ip : 3b5870ef fp : 00000000
> > [ 57.350194] r10: ef787d84 r9 : 00000000 r8 : ef787d80
> > [ 57.350205] r7 : 80070013 r6 : c131ec30 r5 : ef787d40 r4 : f0b2dd64
> > [ 57.350217] r3 : 00000000 r2 : 00000000 r1 : f0b2dd64 r0 : ef787d84
> > [ 57.350230] Flags: NzCv IRQs off FIQs on Mode SVC_32 ISA ARM
> Segment none
> > [ 57.350251] Control: 10c5387d Table: 81d8004a DAC: 00000051
> > [ 57.350261] Register r0 information: non-slab/vmalloc memory
> > [ 57.350283] Register r1 information: 2-page vmalloc region starting at
> 0xf0b2c000 allocated at kernel_clone+0xb4/0x3e4
> > [ 57.350322] Register r2 information: NULL pointer
> > [ 57.350337] Register r3 information: NULL pointer
> > [ 57.350350] Register r4 information: 2-page vmalloc region starting at
> 0xf0b2c000 allocated at kernel_clone+0xb4/0x3e4
> > [ 57.350379] Register r5 information: non-slab/vmalloc memory
> > [ 57.350394] Register r6 information: non-slab/vmalloc memory
> > [ 57.350408] Register r7 information: non-paged memory
> > [ 57.350422] Register r8 information: non-slab/vmalloc memory
> > [ 57.350436] Register r9 information: NULL pointer
> > [ 57.350449] Register r10 information: non-slab/vmalloc memory
> > [ 57.350463] Register r11 information: NULL pointer
> > [ 57.350477] Register r12 information: non-paged memory
> > [ 57.350491] Process rtcwake (pid: 589, stack limit = 0x410bb531)
> > [ 57.350510] Stack: (0xf0b2dd20 to 0xf0b2e000)
> > [ 57.350534] dd20: 00000000 c1ee4a40 f0b2dd7c c0184f24 ef781495
> 3b5870ef c1ee4a40 c1ee4a40
> > [ 57.350555] dd40: c131ec30 00000000 00000002 c0f3d1fc c3542ac0
> c2abbb10 c1ee4a40 c0199044
> > [ 57.350574] dd60: 60070013 00000000 c0195924 00000000 00000000
> f0b2dd74 f0b2dd74 3b5870ef
> > [ 57.350592] dd80: 00000000 c131ebc0 c120ab28 c0146d9c c2785b94
> c2785b40 c0fee9f4 c0872590
> > [ 57.350611] dda0: c2785b40 c08c39cc c2785b40 c08c3a3c c2788c00
> c08c40b0 c0f3d1fc c066f028
> > [ 57.350630] ddc0: f0b2de14 c2788c00 c1325ef4 c08c1d40 c2788c00
> c1325ef4 c0fee9f4 c08c31cc
> > [ 57.350648] dde0: c13708a0 0000000d 00000000 c0681c10 c16afe84
> 00000002 56508788 0000000d
> > [ 57.350665] de00: 00000002 c13708a0 10624dd3 56409580 0000000d
> 00000000 00000002 c0f3d1fc
> > [ 57.350685] de20: c3542ac0 c2abbb10 c1ee4a40 c06824e4 00000000
> ffffa900 00000000 c1386510
> > [ 57.350703] de40: 00000003 00000003 c1204f75 c017e8a8 c3542ac0
> c2abbb10 00428228 c0171574
> > [ 57.350721] de60: 00000000 00000000 00000003 3b5870ef c1204f75
> 00000000 00000003 c137aeb4
> > [ 57.350739] de80: c1204f75 c0f3d1fc c3542ac0 c2abbb10 00428228
> c017f380 00000003 c0f38a54
> > [ 57.350757] dea0: 00000003 c1386524 00000004 c017d708 00000004
> c2abbb00 00000000 00000000
> > [ 57.350775] dec0: c3542ac0 f0b2df28 c2abbb10 c03305b4 00000000
> 00000000 c2953c00 c1ee4a40
> > [ 57.350794] dee0: 00429438 00000004 c0d18488 00004004 00000000
> c02b1094 00000a55 c1d80010
> > [ 57.350812] df00: c1d80010 00000000 00000000 f0b2df78 01010006
> 00000004 00000000 00429438
> > [ 57.350830] df20: 00000000 00000000 c2953c00 00000000 00000000
> 00000000 00000000 00000000
> > [ 57.350848] df40: 00000000 00004004 00000000 00000000 0000006c
> 3b5870ef c2953c00 c2953c00
> > [ 57.350866] df60: 00000000 00000000 c1ee4a40 00429438 00000004
> c02b12c8 00000000 00000000
> > [ 57.350885] df80: 00001008 3b5870ef 0000006c 00429438 00428228
> 00000004 c0100324 c1ee4a40
> > [ 57.350902] dfa0: 00000004 c01000c0 0000006c 00429438 00000004
> 00429438 00000004 00000000
> > [ 57.350920] dfc0: 0000006c 00429438 00428228 00000004 00000004
> 00000004 0041578c 00428228
> > [ 57.350938] dfe0: 00000004 becda9a8 b6e9bc0b b6e26206 600f0030
> 00000004 00000000 00000000
> > [ 57.350960] rcu_segcblist_enqueue from
> srcu_gp_start_if_needed+0xe4/0x544
> > [ 57.351023] srcu_gp_start_if_needed from
> __synchronize_srcu.part.6+0x70/0x98
> > [ 57.351084] __synchronize_srcu.part.6 from
> srcu_notifier_chain_unregister+0x6c/0xdc
> > [ 57.351155] srcu_notifier_chain_unregister from
> cpufreq_unregister_notifier+0x60/0xbc
> > [ 57.351215] cpufreq_unregister_notifier from
> tegra_actmon_pause.part.0+0x1c/0x54
> > [ 57.351277] tegra_actmon_pause.part.0 from tegra_actmon_stop+0x38/0x3c
> > [ 57.351324] tegra_actmon_stop from
> tegra_governor_event_handler+0x100/0x11c
> > [ 57.351373] tegra_governor_event_handler from
> devfreq_suspend_device+0x64/0xac
> > [ 57.351423] devfreq_suspend_device from devfreq_suspend+0x30/0x64
> > [ 57.351467] devfreq_suspend from dpm_suspend+0x34/0x33c
> > [ 57.351506] dpm_suspend from dpm_suspend_start+0x90/0x98
> > [ 57.351528] dpm_suspend_start from
> suspend_devices_and_enter+0xe4/0x93c
> > [ 57.351573] suspend_devices_and_enter from pm_suspend+0x280/0x3ac
> > [ 57.351614] pm_suspend from state_store+0x6c/0xc8
> > [ 57.351654] state_store from kernfs_fop_write_iter+0x118/0x1b4
> > [ 57.351696] kernfs_fop_write_iter from vfs_write+0x314/0x3d4
> > [ 57.351733] vfs_write from ksys_write+0xa0/0xd0
> > [ 57.351760] ksys_write from ret_fast_syscall+0x0/0x54
> > [ 57.351788] Exception stack(0xf0b2dfa8 to 0xf0b2dff0)
> > [ 57.351809] dfa0: 0000006c 00429438 00000004
> 00429438 00000004 00000000
> > [ 57.351828] dfc0: 0000006c 00429438 00428228 00000004 00000004
> 00000004 0041578c 00428228
> > [ 57.351843] dfe0: 00000004 becda9a8 b6e9bc0b b6e26206
> > [ 57.351863] Code: e2833001 e5803034 e5812000 e5903010 (e5831000)
> > [ 57.351875] ---[ end trace 0000000000000000 ]---
> >
> >
> > I have not dug into this yet and so wanted to see if you have any
> thoughts on this?
> >
>
> Hi, Jon
>
> Please try it:
>
> diff --git a/include/linux/notifier.h b/include/linux/notifier.h
> index 2aba75145144..3ce6b59e02e5 100644
> --- a/include/linux/notifier.h
> +++ b/include/linux/notifier.h
> @@ -110,6 +110,7 @@ extern void srcu_init_notifier_head(struct
> srcu_notifier_head *nh);
> { \
> .mutex = __MUTEX_INITIALIZER(name.mutex), \
> .head = NULL, \
> + .srcuu = __SRCU_USAGE_INIT(name.srcuu), \
> .srcu = __SRCU_STRUCT_INIT(name.srcu, name.srcuu, pcpu), \
> }
Thank you both!
Huh. It looks like Chen-Yu Tsai sent a patch to this effect and
AngeloGioacchino Del Regno tested it. No one has picked it up yet.
https://lore.kernel.org/all/[email protected]/
This is clearly a regression, and I don't see it in -next. I will pick
it up and send it along in a few days if Matthias or Rafael don't beat
me to it.
In the meantime, I would be happy to add Jon's Reported-by and Tested-by,
along with Qiang's Acked-by or Reviewed-by.
Thanx, Paul
------------------------------------------------------------------------
commit bf7da55fcdd1478839b21697cf0534be1f149a49
Author: Chen-Yu Tsai <[email protected]>
Date: Fri May 26 15:35:37 2023 +0800
notifier: Initialize new struct srcu_usage field
In commit 95433f726301 ("srcu: Begin offloading srcu_struct fields to
srcu_update"), a new struct srcu_usage field was added, but was not
properly initialized. This led to a "spinlock bad magic" BUG when the
SRCU notifier was ever used. This was observed in the MediaTek CCI
devfreq driver on next-20230525. The trimmed stack trace is as follows:
BUG: spinlock bad magic on CPU#4, swapper/0/1
lock: 0xffffff80ff529ac0, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
Call trace:
spin_bug+0xa4/0xe8
do_raw_spin_lock+0xec/0x120
_raw_spin_lock_irqsave+0x78/0xb8
synchronize_srcu+0x3c/0x168
srcu_notifier_chain_unregister+0x5c/0xa0
cpufreq_unregister_notifier+0x94/0xe0
devfreq_passive_event_handler+0x7c/0x3e0
devfreq_remove_device+0x48/0xe8
Add __SRCU_USAGE_INIT() to SRCU_NOTIFIER_INIT() so that srcu_usage gets
initialized properly.
Fixes: 95433f726301 ("srcu: Begin offloading srcu_struct fields to srcu_update")
Signed-off-by: Chen-Yu Tsai <[email protected]>
Tested-by: AngeloGioacchino Del Regno <[email protected]>
Cc: Matthias Brugger <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: "Michał Mirosław" <[email protected]>
Cc: Dmitry Osipenko <[email protected]>
Cc: Sachin Sant <[email protected]>
Cc: "Zhang, Qiang1" <[email protected]>
Cc: Joel Fernandes (Google) <[email protected]
Cc: Jon Hunter <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
diff --git a/include/linux/notifier.h b/include/linux/notifier.h
index 2aba75145144..86544707236a 100644
--- a/include/linux/notifier.h
+++ b/include/linux/notifier.h
@@ -106,12 +106,22 @@ extern void srcu_init_notifier_head(struct srcu_notifier_head *nh);
#define RAW_NOTIFIER_INIT(name) { \
.head = NULL }
+#ifdef CONFIG_TREE_SRCU
#define SRCU_NOTIFIER_INIT(name, pcpu) \
{ \
.mutex = __MUTEX_INITIALIZER(name.mutex), \
.head = NULL, \
+ .srcuu = __SRCU_USAGE_INIT(name.srcuu), \
.srcu = __SRCU_STRUCT_INIT(name.srcu, name.srcuu, pcpu), \
}
+#else
+#define SRCU_NOTIFIER_INIT(name, pcpu) \
+ { \
+ .mutex = __MUTEX_INITIALIZER(name.mutex), \
+ .head = NULL, \
+ .srcu = __SRCU_STRUCT_INIT(name.srcu, name.srcuu, pcpu), \
+ }
+#endif
#define ATOMIC_NOTIFIER_HEAD(name) \
struct atomic_notifier_head name = \
Hi Paul, Zqiang,
On 01/06/2023 14:46, Paul E. McKenney wrote:
...
> Thank you both!
>
> Huh. It looks like Chen-Yu Tsai sent a patch to this effect and
> AngeloGioacchino Del Regno tested it. No one has picked it up yet.
>
> https://lore.kernel.org/all/[email protected]/
>
> This is clearly a regression, and I don't see it in -next. I will pick
> it up and send it along in a few days if Matthias or Rafael don't beat
> me to it.
>
> In the meantime, I would be happy to add Jon's Reported-by and Tested-by,
> along with Qiang's Acked-by or Reviewed-by.
Thanks for the rapid response. Yes that does fix the problem for me so ...
Tested-by: Jon Hunter <[email protected]>
Cheers!
Jon
--
nvpublic
On Thu, Jun 01, 2023 at 06:14:21PM +0100, Jon Hunter wrote:
> Hi Paul, Zqiang,
>
> On 01/06/2023 14:46, Paul E. McKenney wrote:
>
> ...
>
> > Thank you both!
> >
> > Huh. It looks like Chen-Yu Tsai sent a patch to this effect and
> > AngeloGioacchino Del Regno tested it. No one has picked it up yet.
> >
> > https://lore.kernel.org/all/[email protected]/
> >
> > This is clearly a regression, and I don't see it in -next. I will pick
> > it up and send it along in a few days if Matthias or Rafael don't beat
> > me to it.
> >
> > In the meantime, I would be happy to add Jon's Reported-by and Tested-by,
> > along with Qiang's Acked-by or Reviewed-by.
>
>
> Thanks for the rapid response. Yes that does fix the problem for me so ...
>
> Tested-by: Jon Hunter <[email protected]>
Added, and again, thank you!
Thanx, Paul
On Fri, Jun 02, 2023 at 10:52:44AM +0800, Z qiang wrote:
> >
> > On Thu, Jun 01, 2023 at 08:33:10PM +0800, Z qiang wrote:
> > > >
> > > > Hi Paul,
> > > >
> > > > On 30/03/2023 23:47, Paul E. McKenney wrote:
> > > > > This commit moves the ->srcu_gp_seq, ->srcu_gp_seq_needed,
> > > > > ->srcu_gp_seq_needed_exp, ->srcu_gp_start, and ->srcu_last_gp_end fields
> > > > > from the srcu_struct structure to the srcu_usage structure to reduce
> > > > > the size of the former in order to improve cache locality.
> > > > >
> > > > > Suggested-by: Christoph Hellwig <[email protected]>
> > > > > Tested-by: Sachin Sant <[email protected]>
> > > > > Tested-by: "Zhang, Qiang1" <[email protected]>
> > > > > Signed-off-by: Paul E. McKenney <[email protected]>
> > > >
> > > >
> > > > I have noticed a suspend regression on some of our Tegra boards recently
> > > with v6.4-rc and interestingly bisect is pointing to this commit. I was
> > > unable revert this on top of the latest mainline but if I checkout this
> > > commit suspend fails and if I checkout the previous commit is passes.
> > > >
> > > > Enabling more debug I was able to capture the following crash log I see
> > > on one of the boards ...
> > > >
> > > > [ 57.327645] PM: suspend entry (deep)
> > > > [ 57.331660] Filesystems sync: 0.000 seconds
> > > > [ 57.340147] Freezing user space processes
> > > > [ 57.347470] Freezing user space processes completed (elapsed 0.007
> > > seconds)
> > > > [ 57.347501] OOM killer disabled.
> > > > [ 57.347508] Freezing remaining freezable tasks
> > > > [ 57.348834] Freezing remaining freezable tasks completed (elapsed
> > > 0.001 seconds)
> > > > [ 57.349932] 8<--- cut here ---
> > > > [ 57.349943] Unable to handle kernel NULL pointer dereference at
> > > virtual address 00000000 when write
> > > > [ 57.349960] [00000000] *pgd=00000000
> > > > [ 57.349986] Internal error: Oops: 805 [#1] PREEMPT SMP ARM
> > > > [ 57.350007] Modules linked in: tegra30_tsensor
> > > > [ 57.350033] CPU: 0 PID: 589 Comm: rtcwake Not tainted
> > > 6.3.0-rc1-00011-g03200b5ca3b4-dirty #3
> > > > [ 57.350057] Hardware name: NVIDIA Tegra SoC (Flattened Device Tree)
> > > > [ 57.350067] PC is at rcu_segcblist_enqueue+0x2c/0x38
> > > > [ 57.350120] LR is at srcu_gp_start_if_needed+0xe4/0x544
> > > > [ 57.350169] pc : [<c01a5120>] lr : [<c0198b5c>] psr: a0070093
> > > > [ 57.350183] sp : f0b2dd20 ip : 3b5870ef fp : 00000000
> > > > [ 57.350194] r10: ef787d84 r9 : 00000000 r8 : ef787d80
> > > > [ 57.350205] r7 : 80070013 r6 : c131ec30 r5 : ef787d40 r4 : f0b2dd64
> > > > [ 57.350217] r3 : 00000000 r2 : 00000000 r1 : f0b2dd64 r0 : ef787d84
> > > > [ 57.350230] Flags: NzCv IRQs off FIQs on Mode SVC_32 ISA ARM
> > > Segment none
> > > > [ 57.350251] Control: 10c5387d Table: 81d8004a DAC: 00000051
> > > > [ 57.350261] Register r0 information: non-slab/vmalloc memory
> > > > [ 57.350283] Register r1 information: 2-page vmalloc region starting at
> > > 0xf0b2c000 allocated at kernel_clone+0xb4/0x3e4
> > > > [ 57.350322] Register r2 information: NULL pointer
> > > > [ 57.350337] Register r3 information: NULL pointer
> > > > [ 57.350350] Register r4 information: 2-page vmalloc region starting at
> > > 0xf0b2c000 allocated at kernel_clone+0xb4/0x3e4
> > > > [ 57.350379] Register r5 information: non-slab/vmalloc memory
> > > > [ 57.350394] Register r6 information: non-slab/vmalloc memory
> > > > [ 57.350408] Register r7 information: non-paged memory
> > > > [ 57.350422] Register r8 information: non-slab/vmalloc memory
> > > > [ 57.350436] Register r9 information: NULL pointer
> > > > [ 57.350449] Register r10 information: non-slab/vmalloc memory
> > > > [ 57.350463] Register r11 information: NULL pointer
> > > > [ 57.350477] Register r12 information: non-paged memory
> > > > [ 57.350491] Process rtcwake (pid: 589, stack limit = 0x410bb531)
> > > > [ 57.350510] Stack: (0xf0b2dd20 to 0xf0b2e000)
> > > > [ 57.350534] dd20: 00000000 c1ee4a40 f0b2dd7c c0184f24 ef781495
> > > 3b5870ef c1ee4a40 c1ee4a40
> > > > [ 57.350555] dd40: c131ec30 00000000 00000002 c0f3d1fc c3542ac0
> > > c2abbb10 c1ee4a40 c0199044
> > > > [ 57.350574] dd60: 60070013 00000000 c0195924 00000000 00000000
> > > f0b2dd74 f0b2dd74 3b5870ef
> > > > [ 57.350592] dd80: 00000000 c131ebc0 c120ab28 c0146d9c c2785b94
> > > c2785b40 c0fee9f4 c0872590
> > > > [ 57.350611] dda0: c2785b40 c08c39cc c2785b40 c08c3a3c c2788c00
> > > c08c40b0 c0f3d1fc c066f028
> > > > [ 57.350630] ddc0: f0b2de14 c2788c00 c1325ef4 c08c1d40 c2788c00
> > > c1325ef4 c0fee9f4 c08c31cc
> > > > [ 57.350648] dde0: c13708a0 0000000d 00000000 c0681c10 c16afe84
> > > 00000002 56508788 0000000d
> > > > [ 57.350665] de00: 00000002 c13708a0 10624dd3 56409580 0000000d
> > > 00000000 00000002 c0f3d1fc
> > > > [ 57.350685] de20: c3542ac0 c2abbb10 c1ee4a40 c06824e4 00000000
> > > ffffa900 00000000 c1386510
> > > > [ 57.350703] de40: 00000003 00000003 c1204f75 c017e8a8 c3542ac0
> > > c2abbb10 00428228 c0171574
> > > > [ 57.350721] de60: 00000000 00000000 00000003 3b5870ef c1204f75
> > > 00000000 00000003 c137aeb4
> > > > [ 57.350739] de80: c1204f75 c0f3d1fc c3542ac0 c2abbb10 00428228
> > > c017f380 00000003 c0f38a54
> > > > [ 57.350757] dea0: 00000003 c1386524 00000004 c017d708 00000004
> > > c2abbb00 00000000 00000000
> > > > [ 57.350775] dec0: c3542ac0 f0b2df28 c2abbb10 c03305b4 00000000
> > > 00000000 c2953c00 c1ee4a40
> > > > [ 57.350794] dee0: 00429438 00000004 c0d18488 00004004 00000000
> > > c02b1094 00000a55 c1d80010
> > > > [ 57.350812] df00: c1d80010 00000000 00000000 f0b2df78 01010006
> > > 00000004 00000000 00429438
> > > > [ 57.350830] df20: 00000000 00000000 c2953c00 00000000 00000000
> > > 00000000 00000000 00000000
> > > > [ 57.350848] df40: 00000000 00004004 00000000 00000000 0000006c
> > > 3b5870ef c2953c00 c2953c00
> > > > [ 57.350866] df60: 00000000 00000000 c1ee4a40 00429438 00000004
> > > c02b12c8 00000000 00000000
> > > > [ 57.350885] df80: 00001008 3b5870ef 0000006c 00429438 00428228
> > > 00000004 c0100324 c1ee4a40
> > > > [ 57.350902] dfa0: 00000004 c01000c0 0000006c 00429438 00000004
> > > 00429438 00000004 00000000
> > > > [ 57.350920] dfc0: 0000006c 00429438 00428228 00000004 00000004
> > > 00000004 0041578c 00428228
> > > > [ 57.350938] dfe0: 00000004 becda9a8 b6e9bc0b b6e26206 600f0030
> > > 00000004 00000000 00000000
> > > > [ 57.350960] rcu_segcblist_enqueue from
> > > srcu_gp_start_if_needed+0xe4/0x544
> > > > [ 57.351023] srcu_gp_start_if_needed from
> > > __synchronize_srcu.part.6+0x70/0x98
> > > > [ 57.351084] __synchronize_srcu.part.6 from
> > > srcu_notifier_chain_unregister+0x6c/0xdc
> > > > [ 57.351155] srcu_notifier_chain_unregister from
> > > cpufreq_unregister_notifier+0x60/0xbc
> > > > [ 57.351215] cpufreq_unregister_notifier from
> > > tegra_actmon_pause.part.0+0x1c/0x54
> > > > [ 57.351277] tegra_actmon_pause.part.0 from tegra_actmon_stop+0x38/0x3c
> > > > [ 57.351324] tegra_actmon_stop from
> > > tegra_governor_event_handler+0x100/0x11c
> > > > [ 57.351373] tegra_governor_event_handler from
> > > devfreq_suspend_device+0x64/0xac
> > > > [ 57.351423] devfreq_suspend_device from devfreq_suspend+0x30/0x64
> > > > [ 57.351467] devfreq_suspend from dpm_suspend+0x34/0x33c
> > > > [ 57.351506] dpm_suspend from dpm_suspend_start+0x90/0x98
> > > > [ 57.351528] dpm_suspend_start from
> > > suspend_devices_and_enter+0xe4/0x93c
> > > > [ 57.351573] suspend_devices_and_enter from pm_suspend+0x280/0x3ac
> > > > [ 57.351614] pm_suspend from state_store+0x6c/0xc8
> > > > [ 57.351654] state_store from kernfs_fop_write_iter+0x118/0x1b4
> > > > [ 57.351696] kernfs_fop_write_iter from vfs_write+0x314/0x3d4
> > > > [ 57.351733] vfs_write from ksys_write+0xa0/0xd0
> > > > [ 57.351760] ksys_write from ret_fast_syscall+0x0/0x54
> > > > [ 57.351788] Exception stack(0xf0b2dfa8 to 0xf0b2dff0)
> > > > [ 57.351809] dfa0: 0000006c 00429438 00000004
> > > 00429438 00000004 00000000
> > > > [ 57.351828] dfc0: 0000006c 00429438 00428228 00000004 00000004
> > > 00000004 0041578c 00428228
> > > > [ 57.351843] dfe0: 00000004 becda9a8 b6e9bc0b b6e26206
> > > > [ 57.351863] Code: e2833001 e5803034 e5812000 e5903010 (e5831000)
> > > > [ 57.351875] ---[ end trace 0000000000000000 ]---
> > > >
> > > >
> > > > I have not dug into this yet and so wanted to see if you have any
> > > thoughts on this?
> > > >
> > >
> > > Hi, Jon
> > >
> > > Please try it:
> > >
> > > diff --git a/include/linux/notifier.h b/include/linux/notifier.h
> > > index 2aba75145144..3ce6b59e02e5 100644
> > > --- a/include/linux/notifier.h
> > > +++ b/include/linux/notifier.h
> > > @@ -110,6 +110,7 @@ extern void srcu_init_notifier_head(struct
> > > srcu_notifier_head *nh);
> > > { \
> > > .mutex = __MUTEX_INITIALIZER(name.mutex), \
> > > .head = NULL, \
> > > + .srcuu = __SRCU_USAGE_INIT(name.srcuu), \
> > > .srcu = __SRCU_STRUCT_INIT(name.srcu, name.srcuu, pcpu), \
> > > }
> >
> > Thank you both!
> >
> > Huh. It looks like Chen-Yu Tsai sent a patch to this effect and
> > AngeloGioacchino Del Regno tested it. No one has picked it up yet.
> >
> > https://lore.kernel.org/all/[email protected]/
> >
> > This is clearly a regression, and I don't see it in -next. I will pick
> > it up and send it along in a few days if Matthias or Rafael don't beat
> > me to it.
> >
> > In the meantime, I would be happy to add Jon's Reported-by and Tested-by,
> > along with Qiang's Acked-by or Reviewed-by.
>
> Acked-by: Zqiang <[email protected]>
Thank you! I will apply this on my next rebase.
Thanx, Paul
>
> On Thu, Jun 01, 2023 at 08:33:10PM +0800, Z qiang wrote:
> > >
> > > Hi Paul,
> > >
> > > On 30/03/2023 23:47, Paul E. McKenney wrote:
> > > > This commit moves the ->srcu_gp_seq, ->srcu_gp_seq_needed,
> > > > ->srcu_gp_seq_needed_exp, ->srcu_gp_start, and ->srcu_last_gp_end fields
> > > > from the srcu_struct structure to the srcu_usage structure to reduce
> > > > the size of the former in order to improve cache locality.
> > > >
> > > > Suggested-by: Christoph Hellwig <[email protected]>
> > > > Tested-by: Sachin Sant <[email protected]>
> > > > Tested-by: "Zhang, Qiang1" <[email protected]>
> > > > Signed-off-by: Paul E. McKenney <[email protected]>
> > >
> > >
> > > I have noticed a suspend regression on some of our Tegra boards recently
> > with v6.4-rc and interestingly bisect is pointing to this commit. I was
> > unable revert this on top of the latest mainline but if I checkout this
> > commit suspend fails and if I checkout the previous commit is passes.
> > >
> > > Enabling more debug I was able to capture the following crash log I see
> > on one of the boards ...
> > >
> > > [ 57.327645] PM: suspend entry (deep)
> > > [ 57.331660] Filesystems sync: 0.000 seconds
> > > [ 57.340147] Freezing user space processes
> > > [ 57.347470] Freezing user space processes completed (elapsed 0.007
> > seconds)
> > > [ 57.347501] OOM killer disabled.
> > > [ 57.347508] Freezing remaining freezable tasks
> > > [ 57.348834] Freezing remaining freezable tasks completed (elapsed
> > 0.001 seconds)
> > > [ 57.349932] 8<--- cut here ---
> > > [ 57.349943] Unable to handle kernel NULL pointer dereference at
> > virtual address 00000000 when write
> > > [ 57.349960] [00000000] *pgd=00000000
> > > [ 57.349986] Internal error: Oops: 805 [#1] PREEMPT SMP ARM
> > > [ 57.350007] Modules linked in: tegra30_tsensor
> > > [ 57.350033] CPU: 0 PID: 589 Comm: rtcwake Not tainted
> > 6.3.0-rc1-00011-g03200b5ca3b4-dirty #3
> > > [ 57.350057] Hardware name: NVIDIA Tegra SoC (Flattened Device Tree)
> > > [ 57.350067] PC is at rcu_segcblist_enqueue+0x2c/0x38
> > > [ 57.350120] LR is at srcu_gp_start_if_needed+0xe4/0x544
> > > [ 57.350169] pc : [<c01a5120>] lr : [<c0198b5c>] psr: a0070093
> > > [ 57.350183] sp : f0b2dd20 ip : 3b5870ef fp : 00000000
> > > [ 57.350194] r10: ef787d84 r9 : 00000000 r8 : ef787d80
> > > [ 57.350205] r7 : 80070013 r6 : c131ec30 r5 : ef787d40 r4 : f0b2dd64
> > > [ 57.350217] r3 : 00000000 r2 : 00000000 r1 : f0b2dd64 r0 : ef787d84
> > > [ 57.350230] Flags: NzCv IRQs off FIQs on Mode SVC_32 ISA ARM
> > Segment none
> > > [ 57.350251] Control: 10c5387d Table: 81d8004a DAC: 00000051
> > > [ 57.350261] Register r0 information: non-slab/vmalloc memory
> > > [ 57.350283] Register r1 information: 2-page vmalloc region starting at
> > 0xf0b2c000 allocated at kernel_clone+0xb4/0x3e4
> > > [ 57.350322] Register r2 information: NULL pointer
> > > [ 57.350337] Register r3 information: NULL pointer
> > > [ 57.350350] Register r4 information: 2-page vmalloc region starting at
> > 0xf0b2c000 allocated at kernel_clone+0xb4/0x3e4
> > > [ 57.350379] Register r5 information: non-slab/vmalloc memory
> > > [ 57.350394] Register r6 information: non-slab/vmalloc memory
> > > [ 57.350408] Register r7 information: non-paged memory
> > > [ 57.350422] Register r8 information: non-slab/vmalloc memory
> > > [ 57.350436] Register r9 information: NULL pointer
> > > [ 57.350449] Register r10 information: non-slab/vmalloc memory
> > > [ 57.350463] Register r11 information: NULL pointer
> > > [ 57.350477] Register r12 information: non-paged memory
> > > [ 57.350491] Process rtcwake (pid: 589, stack limit = 0x410bb531)
> > > [ 57.350510] Stack: (0xf0b2dd20 to 0xf0b2e000)
> > > [ 57.350534] dd20: 00000000 c1ee4a40 f0b2dd7c c0184f24 ef781495
> > 3b5870ef c1ee4a40 c1ee4a40
> > > [ 57.350555] dd40: c131ec30 00000000 00000002 c0f3d1fc c3542ac0
> > c2abbb10 c1ee4a40 c0199044
> > > [ 57.350574] dd60: 60070013 00000000 c0195924 00000000 00000000
> > f0b2dd74 f0b2dd74 3b5870ef
> > > [ 57.350592] dd80: 00000000 c131ebc0 c120ab28 c0146d9c c2785b94
> > c2785b40 c0fee9f4 c0872590
> > > [ 57.350611] dda0: c2785b40 c08c39cc c2785b40 c08c3a3c c2788c00
> > c08c40b0 c0f3d1fc c066f028
> > > [ 57.350630] ddc0: f0b2de14 c2788c00 c1325ef4 c08c1d40 c2788c00
> > c1325ef4 c0fee9f4 c08c31cc
> > > [ 57.350648] dde0: c13708a0 0000000d 00000000 c0681c10 c16afe84
> > 00000002 56508788 0000000d
> > > [ 57.350665] de00: 00000002 c13708a0 10624dd3 56409580 0000000d
> > 00000000 00000002 c0f3d1fc
> > > [ 57.350685] de20: c3542ac0 c2abbb10 c1ee4a40 c06824e4 00000000
> > ffffa900 00000000 c1386510
> > > [ 57.350703] de40: 00000003 00000003 c1204f75 c017e8a8 c3542ac0
> > c2abbb10 00428228 c0171574
> > > [ 57.350721] de60: 00000000 00000000 00000003 3b5870ef c1204f75
> > 00000000 00000003 c137aeb4
> > > [ 57.350739] de80: c1204f75 c0f3d1fc c3542ac0 c2abbb10 00428228
> > c017f380 00000003 c0f38a54
> > > [ 57.350757] dea0: 00000003 c1386524 00000004 c017d708 00000004
> > c2abbb00 00000000 00000000
> > > [ 57.350775] dec0: c3542ac0 f0b2df28 c2abbb10 c03305b4 00000000
> > 00000000 c2953c00 c1ee4a40
> > > [ 57.350794] dee0: 00429438 00000004 c0d18488 00004004 00000000
> > c02b1094 00000a55 c1d80010
> > > [ 57.350812] df00: c1d80010 00000000 00000000 f0b2df78 01010006
> > 00000004 00000000 00429438
> > > [ 57.350830] df20: 00000000 00000000 c2953c00 00000000 00000000
> > 00000000 00000000 00000000
> > > [ 57.350848] df40: 00000000 00004004 00000000 00000000 0000006c
> > 3b5870ef c2953c00 c2953c00
> > > [ 57.350866] df60: 00000000 00000000 c1ee4a40 00429438 00000004
> > c02b12c8 00000000 00000000
> > > [ 57.350885] df80: 00001008 3b5870ef 0000006c 00429438 00428228
> > 00000004 c0100324 c1ee4a40
> > > [ 57.350902] dfa0: 00000004 c01000c0 0000006c 00429438 00000004
> > 00429438 00000004 00000000
> > > [ 57.350920] dfc0: 0000006c 00429438 00428228 00000004 00000004
> > 00000004 0041578c 00428228
> > > [ 57.350938] dfe0: 00000004 becda9a8 b6e9bc0b b6e26206 600f0030
> > 00000004 00000000 00000000
> > > [ 57.350960] rcu_segcblist_enqueue from
> > srcu_gp_start_if_needed+0xe4/0x544
> > > [ 57.351023] srcu_gp_start_if_needed from
> > __synchronize_srcu.part.6+0x70/0x98
> > > [ 57.351084] __synchronize_srcu.part.6 from
> > srcu_notifier_chain_unregister+0x6c/0xdc
> > > [ 57.351155] srcu_notifier_chain_unregister from
> > cpufreq_unregister_notifier+0x60/0xbc
> > > [ 57.351215] cpufreq_unregister_notifier from
> > tegra_actmon_pause.part.0+0x1c/0x54
> > > [ 57.351277] tegra_actmon_pause.part.0 from tegra_actmon_stop+0x38/0x3c
> > > [ 57.351324] tegra_actmon_stop from
> > tegra_governor_event_handler+0x100/0x11c
> > > [ 57.351373] tegra_governor_event_handler from
> > devfreq_suspend_device+0x64/0xac
> > > [ 57.351423] devfreq_suspend_device from devfreq_suspend+0x30/0x64
> > > [ 57.351467] devfreq_suspend from dpm_suspend+0x34/0x33c
> > > [ 57.351506] dpm_suspend from dpm_suspend_start+0x90/0x98
> > > [ 57.351528] dpm_suspend_start from
> > suspend_devices_and_enter+0xe4/0x93c
> > > [ 57.351573] suspend_devices_and_enter from pm_suspend+0x280/0x3ac
> > > [ 57.351614] pm_suspend from state_store+0x6c/0xc8
> > > [ 57.351654] state_store from kernfs_fop_write_iter+0x118/0x1b4
> > > [ 57.351696] kernfs_fop_write_iter from vfs_write+0x314/0x3d4
> > > [ 57.351733] vfs_write from ksys_write+0xa0/0xd0
> > > [ 57.351760] ksys_write from ret_fast_syscall+0x0/0x54
> > > [ 57.351788] Exception stack(0xf0b2dfa8 to 0xf0b2dff0)
> > > [ 57.351809] dfa0: 0000006c 00429438 00000004
> > 00429438 00000004 00000000
> > > [ 57.351828] dfc0: 0000006c 00429438 00428228 00000004 00000004
> > 00000004 0041578c 00428228
> > > [ 57.351843] dfe0: 00000004 becda9a8 b6e9bc0b b6e26206
> > > [ 57.351863] Code: e2833001 e5803034 e5812000 e5903010 (e5831000)
> > > [ 57.351875] ---[ end trace 0000000000000000 ]---
> > >
> > >
> > > I have not dug into this yet and so wanted to see if you have any
> > thoughts on this?
> > >
> >
> > Hi, Jon
> >
> > Please try it:
> >
> > diff --git a/include/linux/notifier.h b/include/linux/notifier.h
> > index 2aba75145144..3ce6b59e02e5 100644
> > --- a/include/linux/notifier.h
> > +++ b/include/linux/notifier.h
> > @@ -110,6 +110,7 @@ extern void srcu_init_notifier_head(struct
> > srcu_notifier_head *nh);
> > { \
> > .mutex = __MUTEX_INITIALIZER(name.mutex), \
> > .head = NULL, \
> > + .srcuu = __SRCU_USAGE_INIT(name.srcuu), \
> > .srcu = __SRCU_STRUCT_INIT(name.srcu, name.srcuu, pcpu), \
> > }
>
> Thank you both!
>
> Huh. It looks like Chen-Yu Tsai sent a patch to this effect and
> AngeloGioacchino Del Regno tested it. No one has picked it up yet.
>
> https://lore.kernel.org/all/[email protected]/
>
> This is clearly a regression, and I don't see it in -next. I will pick
> it up and send it along in a few days if Matthias or Rafael don't beat
> me to it.
>
> In the meantime, I would be happy to add Jon's Reported-by and Tested-by,
> along with Qiang's Acked-by or Reviewed-by.
>
Acked-by: Zqiang <[email protected]>
Thanks
Zqiang
>
>
> Thanx, Paul
>
> ------------------------------------------------------------------------
>
> commit bf7da55fcdd1478839b21697cf0534be1f149a49
> Author: Chen-Yu Tsai <[email protected]>
> Date: Fri May 26 15:35:37 2023 +0800
>
> notifier: Initialize new struct srcu_usage field
>
> In commit 95433f726301 ("srcu: Begin offloading srcu_struct fields to
> srcu_update"), a new struct srcu_usage field was added, but was not
> properly initialized. This led to a "spinlock bad magic" BUG when the
> SRCU notifier was ever used. This was observed in the MediaTek CCI
> devfreq driver on next-20230525. The trimmed stack trace is as follows:
>
> BUG: spinlock bad magic on CPU#4, swapper/0/1
> lock: 0xffffff80ff529ac0, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
> Call trace:
> spin_bug+0xa4/0xe8
> do_raw_spin_lock+0xec/0x120
> _raw_spin_lock_irqsave+0x78/0xb8
> synchronize_srcu+0x3c/0x168
> srcu_notifier_chain_unregister+0x5c/0xa0
> cpufreq_unregister_notifier+0x94/0xe0
> devfreq_passive_event_handler+0x7c/0x3e0
> devfreq_remove_device+0x48/0xe8
>
> Add __SRCU_USAGE_INIT() to SRCU_NOTIFIER_INIT() so that srcu_usage gets
> initialized properly.
>
> Fixes: 95433f726301 ("srcu: Begin offloading srcu_struct fields to srcu_update")
> Signed-off-by: Chen-Yu Tsai <[email protected]>
> Tested-by: AngeloGioacchino Del Regno <[email protected]>
> Cc: Matthias Brugger <[email protected]>
> Cc: "Rafael J. Wysocki" <[email protected]>
> Cc: "Michał Mirosław" <[email protected]>
> Cc: Dmitry Osipenko <[email protected]>
> Cc: Sachin Sant <[email protected]>
> Cc: "Zhang, Qiang1" <[email protected]>
> Cc: Joel Fernandes (Google) <[email protected]
> Cc: Jon Hunter <[email protected]>
> Signed-off-by: Paul E. McKenney <[email protected]>
>
> diff --git a/include/linux/notifier.h b/include/linux/notifier.h
> index 2aba75145144..86544707236a 100644
> --- a/include/linux/notifier.h
> +++ b/include/linux/notifier.h
> @@ -106,12 +106,22 @@ extern void srcu_init_notifier_head(struct srcu_notifier_head *nh);
> #define RAW_NOTIFIER_INIT(name) { \
> .head = NULL }
>
> +#ifdef CONFIG_TREE_SRCU
> #define SRCU_NOTIFIER_INIT(name, pcpu) \
> { \
> .mutex = __MUTEX_INITIALIZER(name.mutex), \
> .head = NULL, \
> + .srcuu = __SRCU_USAGE_INIT(name.srcuu), \
> .srcu = __SRCU_STRUCT_INIT(name.srcu, name.srcuu, pcpu), \
> }
> +#else
> +#define SRCU_NOTIFIER_INIT(name, pcpu) \
> + { \
> + .mutex = __MUTEX_INITIALIZER(name.mutex), \
> + .head = NULL, \
> + .srcu = __SRCU_STRUCT_INIT(name.srcu, name.srcuu, pcpu), \
> + }
> +#endif
>
> #define ATOMIC_NOTIFIER_HEAD(name) \
> struct atomic_notifier_head name = \
[CCing the regression list, as it should be in the loop for regressions:
https://docs.kernel.org/admin-guide/reporting-regressions.html]
[TLDR: I'm adding this report to the list of tracked Linux kernel
regressions; the text you find below is based on a few templates
paragraphs you might have encountered already in similar form.
See link in footer if these mails annoy you.]
On 01.06.23 13:14, Jon Hunter wrote:
>
> On 30/03/2023 23:47, Paul E. McKenney wrote:
>> This commit moves the ->srcu_gp_seq, ->srcu_gp_seq_needed,
>> ->srcu_gp_seq_needed_exp, ->srcu_gp_start, and ->srcu_last_gp_end fields
>> from the srcu_struct structure to the srcu_usage structure to reduce
>> the size of the former in order to improve cache locality.
>>
>> Suggested-by: Christoph Hellwig <[email protected]>
>> Tested-by: Sachin Sant <[email protected]>
>> Tested-by: "Zhang, Qiang1" <[email protected]>
>> Signed-off-by: Paul E. McKenney <[email protected]>
>
>
> I have noticed a suspend regression on some of our Tegra boards recently
> with v6.4-rc and interestingly bisect is pointing to this commit. I was
> unable revert this on top of the latest mainline but if I checkout this
> commit suspend fails and if I checkout the previous commit is passes.
Thanks for the report. To be sure the issue doesn't fall through the
cracks unnoticed, I'm adding it to regzbot, the Linux kernel regression
tracking bot:
#regzbot ^introduced 95433f726301
#regzbot title rcu: "spinlock bad magic" BUG when the SRCU notifier was
ever used
#regzbot monitor:
https://lore.kernel.org/all/[email protected]/
#regzbot fix: notifier: Initialize new struct srcu_usage field
#regzbot ignore-activity
Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.