When running the 'kfree_rcu_test' test case with commands [1] the call
trace [2] was thrown. This was because the kfree_scale_thread thread(s)
still run after unloading rcuscale and torture modules. Fix the call
trace by invoking kfree_scale_cleanup() when removing the rcuscale module.
[1] modprobe torture
modprobe rcuscale kfree_rcu_test=1
// After some time
rmmod rcuscale
rmmod torture
[2] BUG: unable to handle page fault for address: ffffffffc0601a87
#PF: supervisor instruction fetch in kernel mode
#PF: error_code(0x0010) - not-present page
PGD 11de4f067 P4D 11de4f067 PUD 11de51067 PMD 112f4d067 PTE 0
Oops: 0010 [#1] PREEMPT SMP NOPTI
CPU: 1 PID: 1798 Comm: kfree_scale_thr Not tainted 6.3.0-rc1-rcu+ #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
RIP: 0010:0xffffffffc0601a87
Code: Unable to access opcode bytes at 0xffffffffc0601a5d.
RSP: 0018:ffffb25bc2e57e18 EFLAGS: 00010297
RAX: 0000000000000000 RBX: ffffffffc061f0b6 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffffff962fd0de RDI: ffffffff962fd0de
RBP: ffffb25bc2e57ea8 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000
R13: 0000000000000000 R14: 000000000000000a R15: 00000000001c1dbe
FS: 0000000000000000(0000) GS:ffff921fa2200000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffc0601a5d CR3: 000000011de4c006 CR4: 0000000000370ee0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
? kvfree_call_rcu+0xf0/0x3a0
? kthread+0xf3/0x120
? kthread_complete_and_exit+0x20/0x20
? ret_from_fork+0x1f/0x30
</TASK>
Modules linked in: rfkill sunrpc ... [last unloaded: torture]
CR2: ffffffffc0601a87
---[ end trace 0000000000000000 ]---
Fixes: e6e78b004fa7 ("rcuperf: Add kfree_rcu() performance Tests")
Signed-off-by: Qiuxu Zhuo <[email protected]>
---
kernel/rcu/rcuscale.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/kernel/rcu/rcuscale.c b/kernel/rcu/rcuscale.c
index 91fb5905a008..5e580cd08c58 100644
--- a/kernel/rcu/rcuscale.c
+++ b/kernel/rcu/rcuscale.c
@@ -522,6 +522,8 @@ rcu_scale_print_module_parms(struct rcu_scale_ops *cur_ops, const char *tag)
scale_type, tag, nrealreaders, nrealwriters, verbose, shutdown);
}
+static void kfree_scale_cleanup(void);
+
static void
rcu_scale_cleanup(void)
{
@@ -542,6 +544,11 @@ rcu_scale_cleanup(void)
if (gp_exp && gp_async)
SCALEOUT_ERRSTRING("No expedited async GPs, so went with async!");
+ if (kfree_rcu_test) {
+ kfree_scale_cleanup();
+ return;
+ }
+
if (torture_cleanup_begin())
return;
if (!cur_ops) {
--
2.17.1
On Mon, Mar 13, 2023 at 04:04:03PM +0800, Qiuxu Zhuo wrote:
> When running the 'kfree_rcu_test' test case with commands [1] the call
> trace [2] was thrown. This was because the kfree_scale_thread thread(s)
> still run after unloading rcuscale and torture modules. Fix the call
> trace by invoking kfree_scale_cleanup() when removing the rcuscale module.
Good catch, thank you!
> [1] modprobe torture
> modprobe rcuscale kfree_rcu_test=1
Given that loading the rcuscale kernel module automatically pulls in
the rcuscale kernel module, why the "modprobe torture"? Is doing the
modprobes separately like this necessary to reproduce this bug?
If it reproduces without manually loading and unloading the "torture"
kernel module, could you please update the commit log to show that
smaller reproducer?
> // After some time
> rmmod rcuscale
> rmmod torture
>
> [2] BUG: unable to handle page fault for address: ffffffffc0601a87
> #PF: supervisor instruction fetch in kernel mode
> #PF: error_code(0x0010) - not-present page
> PGD 11de4f067 P4D 11de4f067 PUD 11de51067 PMD 112f4d067 PTE 0
> Oops: 0010 [#1] PREEMPT SMP NOPTI
> CPU: 1 PID: 1798 Comm: kfree_scale_thr Not tainted 6.3.0-rc1-rcu+ #1
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
> RIP: 0010:0xffffffffc0601a87
> Code: Unable to access opcode bytes at 0xffffffffc0601a5d.
> RSP: 0018:ffffb25bc2e57e18 EFLAGS: 00010297
> RAX: 0000000000000000 RBX: ffffffffc061f0b6 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: ffffffff962fd0de RDI: ffffffff962fd0de
> RBP: ffffb25bc2e57ea8 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000
> R13: 0000000000000000 R14: 000000000000000a R15: 00000000001c1dbe
> FS: 0000000000000000(0000) GS:ffff921fa2200000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: ffffffffc0601a5d CR3: 000000011de4c006 CR4: 0000000000370ee0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
> <TASK>
> ? kvfree_call_rcu+0xf0/0x3a0
> ? kthread+0xf3/0x120
> ? kthread_complete_and_exit+0x20/0x20
> ? ret_from_fork+0x1f/0x30
> </TASK>
> Modules linked in: rfkill sunrpc ... [last unloaded: torture]
> CR2: ffffffffc0601a87
> ---[ end trace 0000000000000000 ]---
>
> Fixes: e6e78b004fa7 ("rcuperf: Add kfree_rcu() performance Tests")
> Signed-off-by: Qiuxu Zhuo <[email protected]>
> ---
> kernel/rcu/rcuscale.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/kernel/rcu/rcuscale.c b/kernel/rcu/rcuscale.c
> index 91fb5905a008..5e580cd08c58 100644
> --- a/kernel/rcu/rcuscale.c
> +++ b/kernel/rcu/rcuscale.c
> @@ -522,6 +522,8 @@ rcu_scale_print_module_parms(struct rcu_scale_ops *cur_ops, const char *tag)
> scale_type, tag, nrealreaders, nrealwriters, verbose, shutdown);
> }
>
> +static void kfree_scale_cleanup(void);
> +
I do applaud minmimizing the size of the patch, but in this case could you
please pull the kfree_scale_cleanup() function ahead of its first use?
Thanx, Paul
> static void
> rcu_scale_cleanup(void)
> {
> @@ -542,6 +544,11 @@ rcu_scale_cleanup(void)
> if (gp_exp && gp_async)
> SCALEOUT_ERRSTRING("No expedited async GPs, so went with async!");
>
> + if (kfree_rcu_test) {
> + kfree_scale_cleanup();
> + return;
> + }
> +
> if (torture_cleanup_begin())
> return;
> if (!cur_ops) {
> --
> 2.17.1
>
On Tue, Mar 14, 2023 at 7:02 PM Paul E. McKenney <[email protected]> wrote:
>
> On Mon, Mar 13, 2023 at 04:04:03PM +0800, Qiuxu Zhuo wrote:
> > When running the 'kfree_rcu_test' test case with commands [1] the call
> > trace [2] was thrown. This was because the kfree_scale_thread thread(s)
> > still run after unloading rcuscale and torture modules. Fix the call
> > trace by invoking kfree_scale_cleanup() when removing the rcuscale module.
>
> Good catch, thank you!
>
> > [1] modprobe torture
> > modprobe rcuscale kfree_rcu_test=1
>
> Given that loading the rcuscale kernel module automatically pulls in
> the rcuscale kernel module, why the "modprobe torture"? Is doing the
> modprobes separately like this necessary to reproduce this bug?
>
> If it reproduces without manually loading and unloading the "torture"
> kernel module, could you please update the commit log to show that
> smaller reproducer?
>
> > // After some time
> > rmmod rcuscale
> > rmmod torture
> >
> > [2] BUG: unable to handle page fault for address: ffffffffc0601a87
> > #PF: supervisor instruction fetch in kernel mode
> > #PF: error_code(0x0010) - not-present page
> > PGD 11de4f067 P4D 11de4f067 PUD 11de51067 PMD 112f4d067 PTE 0
> > Oops: 0010 [#1] PREEMPT SMP NOPTI
[..]
> > Call Trace:
> > <TASK>
> > ? kvfree_call_rcu+0xf0/0x3a0
> > ? kthread+0xf3/0x120
> > ? kthread_complete_and_exit+0x20/0x20
> > ? ret_from_fork+0x1f/0x30
> > </TASK>
> > Modules linked in: rfkill sunrpc ... [last unloaded: torture]
> > CR2: ffffffffc0601a87
> > ---[ end trace 0000000000000000 ]---
> >
> > Fixes: e6e78b004fa7 ("rcuperf: Add kfree_rcu() performance Tests")
> > Signed-off-by: Qiuxu Zhuo <[email protected]>
> > ---
> > kernel/rcu/rcuscale.c | 7 +++++++
> > 1 file changed, 7 insertions(+)
> >
> > diff --git a/kernel/rcu/rcuscale.c b/kernel/rcu/rcuscale.c
> > index 91fb5905a008..5e580cd08c58 100644
> > --- a/kernel/rcu/rcuscale.c
> > +++ b/kernel/rcu/rcuscale.c
> > @@ -522,6 +522,8 @@ rcu_scale_print_module_parms(struct rcu_scale_ops *cur_ops, const char *tag)
> > scale_type, tag, nrealreaders, nrealwriters, verbose, shutdown);
> > }
> >
> > +static void kfree_scale_cleanup(void);
> > +
>
> I do applaud minmimizing the size of the patch, but in this case could you
> please pull the kfree_scale_cleanup() function ahead of its first use?
The only trouble with moving the function like that is, the file is
mostly split across kfree and non-kfree functions. So moving a kfree
function to be among the non-kfree ones would look a bit weird.
Perhaps a better place for the function declaration could be a new
"rcuscale.h". But I am really Ok with Paul's suggestion as well.
Reviewed-by: Joel Fernandes (Google) <[email protected]>
thanks,
- Joel
>
> Thanx, Paul
>
> > static void
> > rcu_scale_cleanup(void)
> > {
> > @@ -542,6 +544,11 @@ rcu_scale_cleanup(void)
> > if (gp_exp && gp_async)
> > SCALEOUT_ERRSTRING("No expedited async GPs, so went with async!");
> >
> > + if (kfree_rcu_test) {
> > + kfree_scale_cleanup();
> > + return;
> > + }
> > +
> > if (torture_cleanup_begin())
> > return;
> > if (!cur_ops) {
> > --
> > 2.17.1
> >
> From: Paul E. McKenney <[email protected]>
> [...]
> Subject: Re: [PATCH 1/1] rcu/rcuscale: Stop kfree_scale_thread thread(s)
> after unloading rcuscale
>
> On Mon, Mar 13, 2023 at 04:04:03PM +0800, Qiuxu Zhuo wrote:
> > When running the 'kfree_rcu_test' test case with commands [1] the call
> > trace [2] was thrown. This was because the kfree_scale_thread
> > thread(s) still run after unloading rcuscale and torture modules. Fix
> > the call trace by invoking kfree_scale_cleanup() when removing the
> rcuscale module.
>
> Good catch, thank you!
>
> > [1] modprobe torture
> > modprobe rcuscale kfree_rcu_test=1
>
> Given that loading the rcuscale kernel module automatically pulls in the
> rcuscale kernel module, why the "modprobe torture"? Is doing the
Oops ...
I forgot that the system could automatically figure out and load dependent modules.
Thank you for pointing it out.
> modprobes separately like this necessary to reproduce this bug?
No. It isn't.
> If it reproduces without manually loading and unloading the "torture"
> kernel module, could you please update the commit log to show that smaller
> reproducer?
To reproduce the bug, it needs to manually unload the "torture" but doesn't need
to manually load "torture".
I'll remove the unnecessary step "modprobe torture" from the commit message in the v2 patch.
> > // After some time
> > rmmod rcuscale
> > rmmod torture
> >
[...]
> > ---
> > kernel/rcu/rcuscale.c | 7 +++++++
> > 1 file changed, 7 insertions(+)
> >
> > diff --git a/kernel/rcu/rcuscale.c b/kernel/rcu/rcuscale.c index
> > 91fb5905a008..5e580cd08c58 100644
> > --- a/kernel/rcu/rcuscale.c
> > +++ b/kernel/rcu/rcuscale.c
> > @@ -522,6 +522,8 @@ rcu_scale_print_module_parms(struct
> rcu_scale_ops *cur_ops, const char *tag)
> > scale_type, tag, nrealreaders, nrealwriters, verbose,
> shutdown);
> > }
> >
> > +static void kfree_scale_cleanup(void);
> > +
>
> I do applaud minmimizing the size of the patch, but in this case could you
> please pull the kfree_scale_cleanup() function ahead of its first use?
>
I thought about it, but was also concerned that would make the patch bigger
while the effective changes were just only a few lines ...
Pulling the kfree_scale_cleanup() function ahead of rcu_scale_cleanup() also needs
to pull another two kfree_* variables ahead used by kfree_scale_cleanup().
This looks like will mess up kfree_* and rcu_scale_* functions/variables in the source file.
How about to pull the rcu_scale_cleanup() function after kfree_scale_cleanup().
This groups kfree_* functions and groups rcu_scale_* functions.
Then the code would look cleaner.
So, do you think the changes below are better?
---
kernel/rcu/rcuscale.c | 201 ++++++++++++++++++++++--------------------
1 file changed, 103 insertions(+), 98 deletions(-)
diff --git a/kernel/rcu/rcuscale.c b/kernel/rcu/rcuscale.c
index 91fb5905a008..5a000d26f03e 100644
--- a/kernel/rcu/rcuscale.c
+++ b/kernel/rcu/rcuscale.c
@@ -522,89 +522,6 @@ rcu_scale_print_module_parms(struct rcu_scale_ops *cur_ops, const char *tag)
scale_type, tag, nrealreaders, nrealwriters, verbose, shutdown);
}
-static void
-rcu_scale_cleanup(void)
-{
- int i;
- int j;
- int ngps = 0;
- u64 *wdp;
- u64 *wdpp;
-
- /*
- * Would like warning at start, but everything is expedited
- * during the mid-boot phase, so have to wait till the end.
- */
- if (rcu_gp_is_expedited() && !rcu_gp_is_normal() && !gp_exp)
- SCALEOUT_ERRSTRING("All grace periods expedited, no normal ones to measure!");
- if (rcu_gp_is_normal() && gp_exp)
- SCALEOUT_ERRSTRING("All grace periods normal, no expedited ones to measure!");
- if (gp_exp && gp_async)
- SCALEOUT_ERRSTRING("No expedited async GPs, so went with async!");
-
- if (torture_cleanup_begin())
- return;
- if (!cur_ops) {
- torture_cleanup_end();
- return;
- }
-
- if (reader_tasks) {
- for (i = 0; i < nrealreaders; i++)
- torture_stop_kthread(rcu_scale_reader,
- reader_tasks[i]);
- kfree(reader_tasks);
- }
-
- if (writer_tasks) {
- for (i = 0; i < nrealwriters; i++) {
- torture_stop_kthread(rcu_scale_writer,
- writer_tasks[i]);
- if (!writer_n_durations)
- continue;
- j = writer_n_durations[i];
- pr_alert("%s%s writer %d gps: %d\n",
- scale_type, SCALE_FLAG, i, j);
- ngps += j;
- }
- pr_alert("%s%s start: %llu end: %llu duration: %llu gps: %d batches: %ld\n",
- scale_type, SCALE_FLAG,
- t_rcu_scale_writer_started, t_rcu_scale_writer_finished,
- t_rcu_scale_writer_finished -
- t_rcu_scale_writer_started,
- ngps,
- rcuscale_seq_diff(b_rcu_gp_test_finished,
- b_rcu_gp_test_started));
- for (i = 0; i < nrealwriters; i++) {
- if (!writer_durations)
- break;
- if (!writer_n_durations)
- continue;
- wdpp = writer_durations[i];
- if (!wdpp)
- continue;
- for (j = 0; j < writer_n_durations[i]; j++) {
- wdp = &wdpp[j];
- pr_alert("%s%s %4d writer-duration: %5d %llu\n",
- scale_type, SCALE_FLAG,
- i, j, *wdp);
- if (j % 100 == 0)
- schedule_timeout_uninterruptible(1);
- }
- kfree(writer_durations[i]);
- }
- kfree(writer_tasks);
- kfree(writer_durations);
- kfree(writer_n_durations);
- }
-
- /* Do torture-type-specific cleanup operations. */
- if (cur_ops->cleanup != NULL)
- cur_ops->cleanup();
-
- torture_cleanup_end();
-}
-
/*
* Return the number if non-negative. If -1, the number of CPUs.
* If less than -1, that much less than the number of CPUs, but
@@ -624,21 +541,6 @@ static int compute_real(int n)
return nr;
}
-/*
- * RCU scalability shutdown kthread. Just waits to be awakened, then shuts
- * down system.
- */
-static int
-rcu_scale_shutdown(void *arg)
-{
- wait_event(shutdown_wq,
- atomic_read(&n_rcu_scale_writer_finished) >= nrealwriters);
- smp_mb(); /* Wake before output. */
- rcu_scale_cleanup();
- kernel_power_off();
- return -EINVAL;
-}
-
/*
* kfree_rcu() scalability tests: Start a kfree_rcu() loop on all CPUs for number
* of iterations and measure total time and number of GP for all iterations to complete.
@@ -875,6 +777,109 @@ kfree_scale_init(void)
return firsterr;
}
+static void
+rcu_scale_cleanup(void)
+{
+ int i;
+ int j;
+ int ngps = 0;
+ u64 *wdp;
+ u64 *wdpp;
+
+ /*
+ * Would like warning at start, but everything is expedited
+ * during the mid-boot phase, so have to wait till the end.
+ */
+ if (rcu_gp_is_expedited() && !rcu_gp_is_normal() && !gp_exp)
+ SCALEOUT_ERRSTRING("All grace periods expedited, no normal ones to measure!");
+ if (rcu_gp_is_normal() && gp_exp)
+ SCALEOUT_ERRSTRING("All grace periods normal, no expedited ones to measure!");
+ if (gp_exp && gp_async)
+ SCALEOUT_ERRSTRING("No expedited async GPs, so went with async!");
+
+ if (kfree_rcu_test) {
+ kfree_scale_cleanup();
+ return;
+ }
+
+ if (torture_cleanup_begin())
+ return;
+ if (!cur_ops) {
+ torture_cleanup_end();
+ return;
+ }
+
+ if (reader_tasks) {
+ for (i = 0; i < nrealreaders; i++)
+ torture_stop_kthread(rcu_scale_reader,
+ reader_tasks[i]);
+ kfree(reader_tasks);
+ }
+
+ if (writer_tasks) {
+ for (i = 0; i < nrealwriters; i++) {
+ torture_stop_kthread(rcu_scale_writer,
+ writer_tasks[i]);
+ if (!writer_n_durations)
+ continue;
+ j = writer_n_durations[i];
+ pr_alert("%s%s writer %d gps: %d\n",
+ scale_type, SCALE_FLAG, i, j);
+ ngps += j;
+ }
+ pr_alert("%s%s start: %llu end: %llu duration: %llu gps: %d batches: %ld\n",
+ scale_type, SCALE_FLAG,
+ t_rcu_scale_writer_started, t_rcu_scale_writer_finished,
+ t_rcu_scale_writer_finished -
+ t_rcu_scale_writer_started,
+ ngps,
+ rcuscale_seq_diff(b_rcu_gp_test_finished,
+ b_rcu_gp_test_started));
+ for (i = 0; i < nrealwriters; i++) {
+ if (!writer_durations)
+ break;
+ if (!writer_n_durations)
+ continue;
+ wdpp = writer_durations[i];
+ if (!wdpp)
+ continue;
+ for (j = 0; j < writer_n_durations[i]; j++) {
+ wdp = &wdpp[j];
+ pr_alert("%s%s %4d writer-duration: %5d %llu\n",
+ scale_type, SCALE_FLAG,
+ i, j, *wdp);
+ if (j % 100 == 0)
+ schedule_timeout_uninterruptible(1);
+ }
+ kfree(writer_durations[i]);
+ }
+ kfree(writer_tasks);
+ kfree(writer_durations);
+ kfree(writer_n_durations);
+ }
+
+ /* Do torture-type-specific cleanup operations. */
+ if (cur_ops->cleanup != NULL)
+ cur_ops->cleanup();
+
+ torture_cleanup_end();
+}
+
+/*
+ * RCU scalability shutdown kthread. Just waits to be awakened, then shuts
+ * down system.
+ */
+static int
+rcu_scale_shutdown(void *arg)
+{
+ wait_event(shutdown_wq,
+ atomic_read(&n_rcu_scale_writer_finished) >= nrealwriters);
+ smp_mb(); /* Wake before output. */
+ rcu_scale_cleanup();
+ kernel_power_off();
+ return -EINVAL;
+}
+
static int __init
rcu_scale_init(void)
{
--
2.17.1
> [...]
On Wed, Mar 15, 2023 at 10:07 AM Zhuo, Qiuxu <[email protected]> wrote:
[...]
> >
> > I do applaud minmimizing the size of the patch, but in this case could you
> > please pull the kfree_scale_cleanup() function ahead of its first use?
> >
>
> I thought about it, but was also concerned that would make the patch bigger
> while the effective changes were just only a few lines ...
>
> Pulling the kfree_scale_cleanup() function ahead of rcu_scale_cleanup() also needs
> to pull another two kfree_* variables ahead used by kfree_scale_cleanup().
> This looks like will mess up kfree_* and rcu_scale_* functions/variables in the source file.
>
> How about to pull the rcu_scale_cleanup() function after kfree_scale_cleanup().
> This groups kfree_* functions and groups rcu_scale_* functions.
> Then the code would look cleaner.
> So, do you think the changes below are better?
IMHO, I don't think doing such a code move is better. Just add a new
header file and declare the function there. But see what Paul says
first.
thanks,
- Joel
>
> ---
> kernel/rcu/rcuscale.c | 201 ++++++++++++++++++++++--------------------
> 1 file changed, 103 insertions(+), 98 deletions(-)
>
> diff --git a/kernel/rcu/rcuscale.c b/kernel/rcu/rcuscale.c
> index 91fb5905a008..5a000d26f03e 100644
> --- a/kernel/rcu/rcuscale.c
> +++ b/kernel/rcu/rcuscale.c
> @@ -522,89 +522,6 @@ rcu_scale_print_module_parms(struct rcu_scale_ops *cur_ops, const char *tag)
> scale_type, tag, nrealreaders, nrealwriters, verbose, shutdown);
> }
>
> -static void
> -rcu_scale_cleanup(void)
> -{
> - int i;
> - int j;
> - int ngps = 0;
> - u64 *wdp;
> - u64 *wdpp;
> -
> - /*
> - * Would like warning at start, but everything is expedited
> - * during the mid-boot phase, so have to wait till the end.
> - */
> - if (rcu_gp_is_expedited() && !rcu_gp_is_normal() && !gp_exp)
> - SCALEOUT_ERRSTRING("All grace periods expedited, no normal ones to measure!");
> - if (rcu_gp_is_normal() && gp_exp)
> - SCALEOUT_ERRSTRING("All grace periods normal, no expedited ones to measure!");
> - if (gp_exp && gp_async)
> - SCALEOUT_ERRSTRING("No expedited async GPs, so went with async!");
> -
> - if (torture_cleanup_begin())
> - return;
> - if (!cur_ops) {
> - torture_cleanup_end();
> - return;
> - }
> -
> - if (reader_tasks) {
> - for (i = 0; i < nrealreaders; i++)
> - torture_stop_kthread(rcu_scale_reader,
> - reader_tasks[i]);
> - kfree(reader_tasks);
> - }
> -
> - if (writer_tasks) {
> - for (i = 0; i < nrealwriters; i++) {
> - torture_stop_kthread(rcu_scale_writer,
> - writer_tasks[i]);
> - if (!writer_n_durations)
> - continue;
> - j = writer_n_durations[i];
> - pr_alert("%s%s writer %d gps: %d\n",
> - scale_type, SCALE_FLAG, i, j);
> - ngps += j;
> - }
> - pr_alert("%s%s start: %llu end: %llu duration: %llu gps: %d batches: %ld\n",
> - scale_type, SCALE_FLAG,
> - t_rcu_scale_writer_started, t_rcu_scale_writer_finished,
> - t_rcu_scale_writer_finished -
> - t_rcu_scale_writer_started,
> - ngps,
> - rcuscale_seq_diff(b_rcu_gp_test_finished,
> - b_rcu_gp_test_started));
> - for (i = 0; i < nrealwriters; i++) {
> - if (!writer_durations)
> - break;
> - if (!writer_n_durations)
> - continue;
> - wdpp = writer_durations[i];
> - if (!wdpp)
> - continue;
> - for (j = 0; j < writer_n_durations[i]; j++) {
> - wdp = &wdpp[j];
> - pr_alert("%s%s %4d writer-duration: %5d %llu\n",
> - scale_type, SCALE_FLAG,
> - i, j, *wdp);
> - if (j % 100 == 0)
> - schedule_timeout_uninterruptible(1);
> - }
> - kfree(writer_durations[i]);
> - }
> - kfree(writer_tasks);
> - kfree(writer_durations);
> - kfree(writer_n_durations);
> - }
> -
> - /* Do torture-type-specific cleanup operations. */
> - if (cur_ops->cleanup != NULL)
> - cur_ops->cleanup();
> -
> - torture_cleanup_end();
> -}
> -
> /*
> * Return the number if non-negative. If -1, the number of CPUs.
> * If less than -1, that much less than the number of CPUs, but
> @@ -624,21 +541,6 @@ static int compute_real(int n)
> return nr;
> }
>
> -/*
> - * RCU scalability shutdown kthread. Just waits to be awakened, then shuts
> - * down system.
> - */
> -static int
> -rcu_scale_shutdown(void *arg)
> -{
> - wait_event(shutdown_wq,
> - atomic_read(&n_rcu_scale_writer_finished) >= nrealwriters);
> - smp_mb(); /* Wake before output. */
> - rcu_scale_cleanup();
> - kernel_power_off();
> - return -EINVAL;
> -}
> -
> /*
> * kfree_rcu() scalability tests: Start a kfree_rcu() loop on all CPUs for number
> * of iterations and measure total time and number of GP for all iterations to complete.
> @@ -875,6 +777,109 @@ kfree_scale_init(void)
> return firsterr;
> }
>
> +static void
> +rcu_scale_cleanup(void)
> +{
> + int i;
> + int j;
> + int ngps = 0;
> + u64 *wdp;
> + u64 *wdpp;
> +
> + /*
> + * Would like warning at start, but everything is expedited
> + * during the mid-boot phase, so have to wait till the end.
> + */
> + if (rcu_gp_is_expedited() && !rcu_gp_is_normal() && !gp_exp)
> + SCALEOUT_ERRSTRING("All grace periods expedited, no normal ones to measure!");
> + if (rcu_gp_is_normal() && gp_exp)
> + SCALEOUT_ERRSTRING("All grace periods normal, no expedited ones to measure!");
> + if (gp_exp && gp_async)
> + SCALEOUT_ERRSTRING("No expedited async GPs, so went with async!");
> +
> + if (kfree_rcu_test) {
> + kfree_scale_cleanup();
> + return;
> + }
> +
> + if (torture_cleanup_begin())
> + return;
> + if (!cur_ops) {
> + torture_cleanup_end();
> + return;
> + }
> +
> + if (reader_tasks) {
> + for (i = 0; i < nrealreaders; i++)
> + torture_stop_kthread(rcu_scale_reader,
> + reader_tasks[i]);
> + kfree(reader_tasks);
> + }
> +
> + if (writer_tasks) {
> + for (i = 0; i < nrealwriters; i++) {
> + torture_stop_kthread(rcu_scale_writer,
> + writer_tasks[i]);
> + if (!writer_n_durations)
> + continue;
> + j = writer_n_durations[i];
> + pr_alert("%s%s writer %d gps: %d\n",
> + scale_type, SCALE_FLAG, i, j);
> + ngps += j;
> + }
> + pr_alert("%s%s start: %llu end: %llu duration: %llu gps: %d batches: %ld\n",
> + scale_type, SCALE_FLAG,
> + t_rcu_scale_writer_started, t_rcu_scale_writer_finished,
> + t_rcu_scale_writer_finished -
> + t_rcu_scale_writer_started,
> + ngps,
> + rcuscale_seq_diff(b_rcu_gp_test_finished,
> + b_rcu_gp_test_started));
> + for (i = 0; i < nrealwriters; i++) {
> + if (!writer_durations)
> + break;
> + if (!writer_n_durations)
> + continue;
> + wdpp = writer_durations[i];
> + if (!wdpp)
> + continue;
> + for (j = 0; j < writer_n_durations[i]; j++) {
> + wdp = &wdpp[j];
> + pr_alert("%s%s %4d writer-duration: %5d %llu\n",
> + scale_type, SCALE_FLAG,
> + i, j, *wdp);
> + if (j % 100 == 0)
> + schedule_timeout_uninterruptible(1);
> + }
> + kfree(writer_durations[i]);
> + }
> + kfree(writer_tasks);
> + kfree(writer_durations);
> + kfree(writer_n_durations);
> + }
> +
> + /* Do torture-type-specific cleanup operations. */
> + if (cur_ops->cleanup != NULL)
> + cur_ops->cleanup();
> +
> + torture_cleanup_end();
> +}
> +
> +/*
> + * RCU scalability shutdown kthread. Just waits to be awakened, then shuts
> + * down system.
> + */
> +static int
> +rcu_scale_shutdown(void *arg)
> +{
> + wait_event(shutdown_wq,
> + atomic_read(&n_rcu_scale_writer_finished) >= nrealwriters);
> + smp_mb(); /* Wake before output. */
> + rcu_scale_cleanup();
> + kernel_power_off();
> + return -EINVAL;
> +}
> +
> static int __init
> rcu_scale_init(void)
> {
> --
> 2.17.1
>
> > [...]
>
> From: Joel Fernandes <[email protected]>
> [...]
> > > kernel/rcu/rcuscale.c | 7 +++++++
> > > 1 file changed, 7 insertions(+)
> > >
> > > diff --git a/kernel/rcu/rcuscale.c b/kernel/rcu/rcuscale.c index
> > > 91fb5905a008..5e580cd08c58 100644
> > > --- a/kernel/rcu/rcuscale.c
> > > +++ b/kernel/rcu/rcuscale.c
> > > @@ -522,6 +522,8 @@ rcu_scale_print_module_parms(struct
> rcu_scale_ops *cur_ops, const char *tag)
> > > scale_type, tag, nrealreaders, nrealwriters, verbose,
> > > shutdown); }
> > >
> > > +static void kfree_scale_cleanup(void);
> > > +
> >
> > I do applaud minmimizing the size of the patch, but in this case could
> > you please pull the kfree_scale_cleanup() function ahead of its first use?
>
> The only trouble with moving the function like that is, the file is mostly split
> across kfree and non-kfree functions. So moving a kfree function to be
> among the non-kfree ones would look a bit weird.
Yes, this would look a bit weird ...
Please see the reply to Paul in another e-mail:
"Pull the rcu_scale_cleanup() function after kfree_scale_cleanup().
This groups kfree_* functions and groups rcu_scale_* functions.
Then the code would look cleaner."
> Perhaps a better place for the function declaration could be a new
> "rcuscale.h". But I am really Ok with Paul's suggestion as well.
>
> Reviewed-by: Joel Fernandes (Google) <[email protected]>
Thanks for the review. :-)
> thanks,
>
> - Joel
On Wed, Mar 15, 2023 at 10:12:05AM -0400, Joel Fernandes wrote:
> On Wed, Mar 15, 2023 at 10:07 AM Zhuo, Qiuxu <[email protected]> wrote:
> [...]
> > >
> > > I do applaud minmimizing the size of the patch, but in this case could you
> > > please pull the kfree_scale_cleanup() function ahead of its first use?
> > >
> >
> > I thought about it, but was also concerned that would make the patch bigger
> > while the effective changes were just only a few lines ...
> >
> > Pulling the kfree_scale_cleanup() function ahead of rcu_scale_cleanup() also needs
> > to pull another two kfree_* variables ahead used by kfree_scale_cleanup().
> > This looks like will mess up kfree_* and rcu_scale_* functions/variables in the source file.
> >
> > How about to pull the rcu_scale_cleanup() function after kfree_scale_cleanup().
> > This groups kfree_* functions and groups rcu_scale_* functions.
> > Then the code would look cleaner.
> > So, do you think the changes below are better?
>
> IMHO, I don't think doing such a code move is better. Just add a new
> header file and declare the function there. But see what Paul says
> first.
This situation is likely to be an early hint that the kvfree_rcu() testing
should be split out from kernel/rcu/rcuscale.c.
Thanx, Paul
> thanks,
>
> - Joel
>
>
>
> >
> > ---
> > kernel/rcu/rcuscale.c | 201 ++++++++++++++++++++++--------------------
> > 1 file changed, 103 insertions(+), 98 deletions(-)
> >
> > diff --git a/kernel/rcu/rcuscale.c b/kernel/rcu/rcuscale.c
> > index 91fb5905a008..5a000d26f03e 100644
> > --- a/kernel/rcu/rcuscale.c
> > +++ b/kernel/rcu/rcuscale.c
> > @@ -522,89 +522,6 @@ rcu_scale_print_module_parms(struct rcu_scale_ops *cur_ops, const char *tag)
> > scale_type, tag, nrealreaders, nrealwriters, verbose, shutdown);
> > }
> >
> > -static void
> > -rcu_scale_cleanup(void)
> > -{
> > - int i;
> > - int j;
> > - int ngps = 0;
> > - u64 *wdp;
> > - u64 *wdpp;
> > -
> > - /*
> > - * Would like warning at start, but everything is expedited
> > - * during the mid-boot phase, so have to wait till the end.
> > - */
> > - if (rcu_gp_is_expedited() && !rcu_gp_is_normal() && !gp_exp)
> > - SCALEOUT_ERRSTRING("All grace periods expedited, no normal ones to measure!");
> > - if (rcu_gp_is_normal() && gp_exp)
> > - SCALEOUT_ERRSTRING("All grace periods normal, no expedited ones to measure!");
> > - if (gp_exp && gp_async)
> > - SCALEOUT_ERRSTRING("No expedited async GPs, so went with async!");
> > -
> > - if (torture_cleanup_begin())
> > - return;
> > - if (!cur_ops) {
> > - torture_cleanup_end();
> > - return;
> > - }
> > -
> > - if (reader_tasks) {
> > - for (i = 0; i < nrealreaders; i++)
> > - torture_stop_kthread(rcu_scale_reader,
> > - reader_tasks[i]);
> > - kfree(reader_tasks);
> > - }
> > -
> > - if (writer_tasks) {
> > - for (i = 0; i < nrealwriters; i++) {
> > - torture_stop_kthread(rcu_scale_writer,
> > - writer_tasks[i]);
> > - if (!writer_n_durations)
> > - continue;
> > - j = writer_n_durations[i];
> > - pr_alert("%s%s writer %d gps: %d\n",
> > - scale_type, SCALE_FLAG, i, j);
> > - ngps += j;
> > - }
> > - pr_alert("%s%s start: %llu end: %llu duration: %llu gps: %d batches: %ld\n",
> > - scale_type, SCALE_FLAG,
> > - t_rcu_scale_writer_started, t_rcu_scale_writer_finished,
> > - t_rcu_scale_writer_finished -
> > - t_rcu_scale_writer_started,
> > - ngps,
> > - rcuscale_seq_diff(b_rcu_gp_test_finished,
> > - b_rcu_gp_test_started));
> > - for (i = 0; i < nrealwriters; i++) {
> > - if (!writer_durations)
> > - break;
> > - if (!writer_n_durations)
> > - continue;
> > - wdpp = writer_durations[i];
> > - if (!wdpp)
> > - continue;
> > - for (j = 0; j < writer_n_durations[i]; j++) {
> > - wdp = &wdpp[j];
> > - pr_alert("%s%s %4d writer-duration: %5d %llu\n",
> > - scale_type, SCALE_FLAG,
> > - i, j, *wdp);
> > - if (j % 100 == 0)
> > - schedule_timeout_uninterruptible(1);
> > - }
> > - kfree(writer_durations[i]);
> > - }
> > - kfree(writer_tasks);
> > - kfree(writer_durations);
> > - kfree(writer_n_durations);
> > - }
> > -
> > - /* Do torture-type-specific cleanup operations. */
> > - if (cur_ops->cleanup != NULL)
> > - cur_ops->cleanup();
> > -
> > - torture_cleanup_end();
> > -}
> > -
> > /*
> > * Return the number if non-negative. If -1, the number of CPUs.
> > * If less than -1, that much less than the number of CPUs, but
> > @@ -624,21 +541,6 @@ static int compute_real(int n)
> > return nr;
> > }
> >
> > -/*
> > - * RCU scalability shutdown kthread. Just waits to be awakened, then shuts
> > - * down system.
> > - */
> > -static int
> > -rcu_scale_shutdown(void *arg)
> > -{
> > - wait_event(shutdown_wq,
> > - atomic_read(&n_rcu_scale_writer_finished) >= nrealwriters);
> > - smp_mb(); /* Wake before output. */
> > - rcu_scale_cleanup();
> > - kernel_power_off();
> > - return -EINVAL;
> > -}
> > -
> > /*
> > * kfree_rcu() scalability tests: Start a kfree_rcu() loop on all CPUs for number
> > * of iterations and measure total time and number of GP for all iterations to complete.
> > @@ -875,6 +777,109 @@ kfree_scale_init(void)
> > return firsterr;
> > }
> >
> > +static void
> > +rcu_scale_cleanup(void)
> > +{
> > + int i;
> > + int j;
> > + int ngps = 0;
> > + u64 *wdp;
> > + u64 *wdpp;
> > +
> > + /*
> > + * Would like warning at start, but everything is expedited
> > + * during the mid-boot phase, so have to wait till the end.
> > + */
> > + if (rcu_gp_is_expedited() && !rcu_gp_is_normal() && !gp_exp)
> > + SCALEOUT_ERRSTRING("All grace periods expedited, no normal ones to measure!");
> > + if (rcu_gp_is_normal() && gp_exp)
> > + SCALEOUT_ERRSTRING("All grace periods normal, no expedited ones to measure!");
> > + if (gp_exp && gp_async)
> > + SCALEOUT_ERRSTRING("No expedited async GPs, so went with async!");
> > +
> > + if (kfree_rcu_test) {
> > + kfree_scale_cleanup();
> > + return;
> > + }
> > +
> > + if (torture_cleanup_begin())
> > + return;
> > + if (!cur_ops) {
> > + torture_cleanup_end();
> > + return;
> > + }
> > +
> > + if (reader_tasks) {
> > + for (i = 0; i < nrealreaders; i++)
> > + torture_stop_kthread(rcu_scale_reader,
> > + reader_tasks[i]);
> > + kfree(reader_tasks);
> > + }
> > +
> > + if (writer_tasks) {
> > + for (i = 0; i < nrealwriters; i++) {
> > + torture_stop_kthread(rcu_scale_writer,
> > + writer_tasks[i]);
> > + if (!writer_n_durations)
> > + continue;
> > + j = writer_n_durations[i];
> > + pr_alert("%s%s writer %d gps: %d\n",
> > + scale_type, SCALE_FLAG, i, j);
> > + ngps += j;
> > + }
> > + pr_alert("%s%s start: %llu end: %llu duration: %llu gps: %d batches: %ld\n",
> > + scale_type, SCALE_FLAG,
> > + t_rcu_scale_writer_started, t_rcu_scale_writer_finished,
> > + t_rcu_scale_writer_finished -
> > + t_rcu_scale_writer_started,
> > + ngps,
> > + rcuscale_seq_diff(b_rcu_gp_test_finished,
> > + b_rcu_gp_test_started));
> > + for (i = 0; i < nrealwriters; i++) {
> > + if (!writer_durations)
> > + break;
> > + if (!writer_n_durations)
> > + continue;
> > + wdpp = writer_durations[i];
> > + if (!wdpp)
> > + continue;
> > + for (j = 0; j < writer_n_durations[i]; j++) {
> > + wdp = &wdpp[j];
> > + pr_alert("%s%s %4d writer-duration: %5d %llu\n",
> > + scale_type, SCALE_FLAG,
> > + i, j, *wdp);
> > + if (j % 100 == 0)
> > + schedule_timeout_uninterruptible(1);
> > + }
> > + kfree(writer_durations[i]);
> > + }
> > + kfree(writer_tasks);
> > + kfree(writer_durations);
> > + kfree(writer_n_durations);
> > + }
> > +
> > + /* Do torture-type-specific cleanup operations. */
> > + if (cur_ops->cleanup != NULL)
> > + cur_ops->cleanup();
> > +
> > + torture_cleanup_end();
> > +}
> > +
> > +/*
> > + * RCU scalability shutdown kthread. Just waits to be awakened, then shuts
> > + * down system.
> > + */
> > +static int
> > +rcu_scale_shutdown(void *arg)
> > +{
> > + wait_event(shutdown_wq,
> > + atomic_read(&n_rcu_scale_writer_finished) >= nrealwriters);
> > + smp_mb(); /* Wake before output. */
> > + rcu_scale_cleanup();
> > + kernel_power_off();
> > + return -EINVAL;
> > +}
> > +
> > static int __init
> > rcu_scale_init(void)
> > {
> > --
> > 2.17.1
> >
> > > [...]
> >
> From: Paul E. McKenney <[email protected]>
> [...]
> > >
> > > How about to pull the rcu_scale_cleanup() function after
> kfree_scale_cleanup().
> > > This groups kfree_* functions and groups rcu_scale_* functions.
> > > Then the code would look cleaner.
> > > So, do you think the changes below are better?
> >
> > IMHO, I don't think doing such a code move is better. Just add a new
> > header file and declare the function there. But see what Paul says
> > first.
>
> This situation is likely to be an early hint that the kvfree_rcu() testing should
> be split out from kernel/rcu/rcuscale.c.
Another is that it's a bit expensive to create a new header file just for
eliminating a function declaration. ;-)
So, if no objections, I'd like to send out the v2 patch with the updates below:
- Move rcu_scale_cleanup() after kfree_scale_cleanup() to eliminate the
declaration of kfree_scale_cleanup(). Though this makes the patch bigger,
get the file rcuscale.c much cleaner.
- Remove the unnecessary step "modprobe torture" from the commit message.
- Add the description for why move rcu_scale_cleanup() after
kfree_scale_cleanup() to the commit message.
Thanks!
-Qiuxu
> [...]
> On Mar 16, 2023, at 9:17 AM, Zhuo, Qiuxu <[email protected]> wrote:
>
>
>>
>> From: Paul E. McKenney <[email protected]>
>> [...]
>>>>
>>>> How about to pull the rcu_scale_cleanup() function after
>> kfree_scale_cleanup().
>>>> This groups kfree_* functions and groups rcu_scale_* functions.
>>>> Then the code would look cleaner.
>>>> So, do you think the changes below are better?
>>>
>>> IMHO, I don't think doing such a code move is better. Just add a new
>>> header file and declare the function there. But see what Paul says
>>> first.
>>
>> This situation is likely to be an early hint that the kvfree_rcu() testing should
>> be split out from kernel/rcu/rcuscale.c.
>
> Another is that it's a bit expensive to create a new header file just for
> eliminating a function declaration. ;-)
What is so expensive about new files? It is a natural organization structure.
> So, if no objections, I'd like to send out the v2 patch with the updates below:
>
> - Move rcu_scale_cleanup() after kfree_scale_cleanup() to eliminate the
> declaration of kfree_scale_cleanup(). Though this makes the patch bigger,
> get the file rcuscale.c much cleaner.
>
> - Remove the unnecessary step "modprobe torture" from the commit message.
>
> - Add the description for why move rcu_scale_cleanup() after
> kfree_scale_cleanup() to the commit message.
Honestly if you are moving so many lines around, you may as well split it out into a new module.
The kfree stuff being clubbed in the same file has also been a major annoyance.
- Joel
> Thanks!
> -Qiuxu
>
>> [...]
> From: Joel Fernandes <[email protected]>
> Sent: Thursday, March 16, 2023 9:29 PM
> To: Zhuo, Qiuxu <[email protected]>
> Cc: [email protected]; Frederic Weisbecker <[email protected]>; Josh
> Triplett <[email protected]>; Neeraj Upadhyay
> <[email protected]>; Davidlohr Bueso <[email protected]>; Steven
> Rostedt <[email protected]>; Mathieu Desnoyers
> <[email protected]>; Lai Jiangshan
> <[email protected]>; [email protected]; linux-
> [email protected]
> Subject: Re: [PATCH 1/1] rcu/rcuscale: Stop kfree_scale_thread thread(s)
> after unloading rcuscale
>
>
> > On Mar 16, 2023, at 9:17 AM, Zhuo, Qiuxu <[email protected]> wrote:
> >
> >
> >>
> >> From: Paul E. McKenney <[email protected]> [...]
> >>>>
> >>>> How about to pull the rcu_scale_cleanup() function after
> >> kfree_scale_cleanup().
> >>>> This groups kfree_* functions and groups rcu_scale_* functions.
> >>>> Then the code would look cleaner.
> >>>> So, do you think the changes below are better?
> >>>
> >>> IMHO, I don't think doing such a code move is better. Just add a new
> >>> header file and declare the function there. But see what Paul says
> >>> first.
> >>
> >> This situation is likely to be an early hint that the kvfree_rcu()
> >> testing should be split out from kernel/rcu/rcuscale.c.
> >
> > Another is that it's a bit expensive to create a new header file just
> > for eliminating a function declaration. ;-)
>
> What is so expensive about new files? It is a natural organization structure.
>
> > So, if no objections, I'd like to send out the v2 patch with the updates below:
> >
> > - Move rcu_scale_cleanup() after kfree_scale_cleanup() to eliminate the
> > declaration of kfree_scale_cleanup(). Though this makes the patch bigger,
> > get the file rcuscale.c much cleaner.
> >
> > - Remove the unnecessary step "modprobe torture" from the commit
> message.
> >
> > - Add the description for why move rcu_scale_cleanup() after
> > kfree_scale_cleanup() to the commit message.
>
> Honestly if you are moving so many lines around, you may as well split it out
> into a new module.
> The kfree stuff being clubbed in the same file has also been a major
> annoyance.
I'm OK with creating a new kernel module for these kfree stuffs,
but do we really need to do that?
@paulmck, what's your suggestion for the next step?
> - Joel
>
>
> > Thanks!
> > -Qiuxu
> >
> >> [...]
On Thu, Mar 16, 2023 at 9:53 AM Zhuo, Qiuxu <[email protected]> wrote:
>
[...]
> > >> From: Paul E. McKenney <[email protected]> [...]
> > >>>>
> > >>>> How about to pull the rcu_scale_cleanup() function after
> > >> kfree_scale_cleanup().
> > >>>> This groups kfree_* functions and groups rcu_scale_* functions.
> > >>>> Then the code would look cleaner.
> > >>>> So, do you think the changes below are better?
> > >>>
> > >>> IMHO, I don't think doing such a code move is better. Just add a new
> > >>> header file and declare the function there. But see what Paul says
> > >>> first.
> > >>
> > >> This situation is likely to be an early hint that the kvfree_rcu()
> > >> testing should be split out from kernel/rcu/rcuscale.c.
> > >
> > > Another is that it's a bit expensive to create a new header file just
> > > for eliminating a function declaration. ;-)
> >
> > What is so expensive about new files? It is a natural organization structure.
> >
> > > So, if no objections, I'd like to send out the v2 patch with the updates below:
> > >
> > > - Move rcu_scale_cleanup() after kfree_scale_cleanup() to eliminate the
> > > declaration of kfree_scale_cleanup(). Though this makes the patch bigger,
> > > get the file rcuscale.c much cleaner.
> > >
> > > - Remove the unnecessary step "modprobe torture" from the commit
> > message.
> > >
> > > - Add the description for why move rcu_scale_cleanup() after
> > > kfree_scale_cleanup() to the commit message.
> >
> > Honestly if you are moving so many lines around, you may as well split it out
> > into a new module.
> > The kfree stuff being clubbed in the same file has also been a major
> > annoyance.
>
> I'm OK with creating a new kernel module for these kfree stuffs,
> but do we really need to do that?
If it were me doing this, I would try to split it just because in the
long term I may have to maintain or deal with it.
I was also thinking a new scale directory _may_ make sense for
performance tests.
kernel/rcu/scaletests/kfree.c
kernel/rcu/scaletests/core.c
kernel/rcu/scaletests/ref.c
Or something like that.
and then maybe putt common code into: kernel/rcu/scaletests/common.c
- Joel
>
> @paulmck, what's your suggestion for the next step?
>
> > - Joel
> >
> >
> > > Thanks!
> > > -Qiuxu
> > >
> > >> [...]
On Thu, Mar 16, 2023 at 10:57:06AM -0400, Joel Fernandes wrote:
> On Thu, Mar 16, 2023 at 9:53 AM Zhuo, Qiuxu <[email protected]> wrote:
> >
> [...]
> > > >> From: Paul E. McKenney <[email protected]> [...]
> > > >>>>
> > > >>>> How about to pull the rcu_scale_cleanup() function after
> > > >> kfree_scale_cleanup().
> > > >>>> This groups kfree_* functions and groups rcu_scale_* functions.
> > > >>>> Then the code would look cleaner.
> > > >>>> So, do you think the changes below are better?
> > > >>>
> > > >>> IMHO, I don't think doing such a code move is better. Just add a new
> > > >>> header file and declare the function there. But see what Paul says
> > > >>> first.
> > > >>
> > > >> This situation is likely to be an early hint that the kvfree_rcu()
> > > >> testing should be split out from kernel/rcu/rcuscale.c.
> > > >
> > > > Another is that it's a bit expensive to create a new header file just
> > > > for eliminating a function declaration. ;-)
> > >
> > > What is so expensive about new files? It is a natural organization structure.
> > >
> > > > So, if no objections, I'd like to send out the v2 patch with the updates below:
> > > >
> > > > - Move rcu_scale_cleanup() after kfree_scale_cleanup() to eliminate the
> > > > declaration of kfree_scale_cleanup(). Though this makes the patch bigger,
> > > > get the file rcuscale.c much cleaner.
> > > >
> > > > - Remove the unnecessary step "modprobe torture" from the commit
> > > message.
> > > >
> > > > - Add the description for why move rcu_scale_cleanup() after
> > > > kfree_scale_cleanup() to the commit message.
> > >
> > > Honestly if you are moving so many lines around, you may as well split it out
> > > into a new module.
> > > The kfree stuff being clubbed in the same file has also been a major
> > > annoyance.
> >
> > I'm OK with creating a new kernel module for these kfree stuffs,
> > but do we really need to do that?
It is not a particularly high priority.
> If it were me doing this, I would try to split it just because in the
> long term I may have to maintain or deal with it.
>
> I was also thinking a new scale directory _may_ make sense for
> performance tests.
>
> kernel/rcu/scaletests/kfree.c
> kernel/rcu/scaletests/core.c
> kernel/rcu/scaletests/ref.c
>
> Or something like that.
I don't believe we are there yet, but...
> and then maybe putt common code into: kernel/rcu/scaletests/common.c
...splitting out the common code within the current directory/file
structure makes a lot of sense to me. Not that I have checked up on
exactly how much common code there really is. ;-)
Thanx, Paul
> - Joel
>
> >
> > @paulmck, what's your suggestion for the next step?
> >
> > > - Joel
> > >
> > >
> > > > Thanks!
> > > > -Qiuxu
> > > >
> > > >> [...]
> From: Joel Fernandes <[email protected]>
> [...]
> > I'm OK with creating a new kernel module for these kfree stuffs, but
> > do we really need to do that?
>
> If it were me doing this, I would try to split it just because in the long term I
> may have to maintain or deal with it.
>
> I was also thinking a new scale directory _may_ make sense for performance
> tests.
>
> kernel/rcu/scaletests/kfree.c
> kernel/rcu/scaletests/core.c
> kernel/rcu/scaletests/ref.c
This looks like really a good constructive suggestion.
But today, please give me a break. ;-).
Thanks!
-Qiuxu
> Or something like that.
>
> and then maybe putt common code into: kernel/rcu/scaletests/common.c
When running the 'kfree_rcu_test' test case with commands [1] the call
trace [2] was thrown. This was because the kfree_scale_thread thread(s)
still run after unloading rcuscale and torture modules. Fix the call
trace by invoking kfree_scale_cleanup() from rcu_scale_cleanup() when
removing the rcuscale module.
Additionally, current rcuscale.c defines kfree_scale_cleanup() after
rcu_scale_cleanup(), to avoid the declaration of kfree_scale_cleanup()
when rcu_scale_cleanup() invoking kfree_scale_cleanup(), move
rcu_scale_cleanup() after kfree_scale_cleanup().
[1] modprobe rcuscale kfree_rcu_test=1
// After some time
rmmod rcuscale
rmmod torture
[2] BUG: unable to handle page fault for address: ffffffffc0601a87
#PF: supervisor instruction fetch in kernel mode
#PF: error_code(0x0010) - not-present page
PGD 11de4f067 P4D 11de4f067 PUD 11de51067 PMD 112f4d067 PTE 0
Oops: 0010 [#1] PREEMPT SMP NOPTI
CPU: 1 PID: 1798 Comm: kfree_scale_thr Not tainted 6.3.0-rc1-rcu+ #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
RIP: 0010:0xffffffffc0601a87
Code: Unable to access opcode bytes at 0xffffffffc0601a5d.
RSP: 0018:ffffb25bc2e57e18 EFLAGS: 00010297
RAX: 0000000000000000 RBX: ffffffffc061f0b6 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffffff962fd0de RDI: ffffffff962fd0de
RBP: ffffb25bc2e57ea8 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000
R13: 0000000000000000 R14: 000000000000000a R15: 00000000001c1dbe
FS: 0000000000000000(0000) GS:ffff921fa2200000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffc0601a5d CR3: 000000011de4c006 CR4: 0000000000370ee0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
? kvfree_call_rcu+0xf0/0x3a0
? kthread+0xf3/0x120
? kthread_complete_and_exit+0x20/0x20
? ret_from_fork+0x1f/0x30
</TASK>
Modules linked in: rfkill sunrpc ... [last unloaded: torture]
CR2: ffffffffc0601a87
---[ end trace 0000000000000000 ]---
Fixes: e6e78b004fa7 ("rcuperf: Add kfree_rcu() performance Tests")
Signed-off-by: Qiuxu Zhuo <[email protected]>
---
v1 -> v2:
- Move rcu_scale_cleanup() after kfree_scale_cleanup() to eliminate the
declaration of kfree_scale_cleanup().
- Remove the unnecessary step "modprobe torture" from the commit message.
- Add the description for why move rcu_scale_cleanup() after
kfree_scale_cleanup() to the commit message.
Thanks Paul's comments on eliminating the extra function declaration and
removing the unnecessary "modprobe torture" from the commit message.
Thanks Joel's constructive comments that for long-term maintenance we may
need to split out the common code within current {ref,rcu}scale.c files.
kernel/rcu/rcuscale.c | 201 ++++++++++++++++++++++--------------------
1 file changed, 103 insertions(+), 98 deletions(-)
diff --git a/kernel/rcu/rcuscale.c b/kernel/rcu/rcuscale.c
index 91fb5905a008..5a000d26f03e 100644
--- a/kernel/rcu/rcuscale.c
+++ b/kernel/rcu/rcuscale.c
@@ -522,89 +522,6 @@ rcu_scale_print_module_parms(struct rcu_scale_ops *cur_ops, const char *tag)
scale_type, tag, nrealreaders, nrealwriters, verbose, shutdown);
}
-static void
-rcu_scale_cleanup(void)
-{
- int i;
- int j;
- int ngps = 0;
- u64 *wdp;
- u64 *wdpp;
-
- /*
- * Would like warning at start, but everything is expedited
- * during the mid-boot phase, so have to wait till the end.
- */
- if (rcu_gp_is_expedited() && !rcu_gp_is_normal() && !gp_exp)
- SCALEOUT_ERRSTRING("All grace periods expedited, no normal ones to measure!");
- if (rcu_gp_is_normal() && gp_exp)
- SCALEOUT_ERRSTRING("All grace periods normal, no expedited ones to measure!");
- if (gp_exp && gp_async)
- SCALEOUT_ERRSTRING("No expedited async GPs, so went with async!");
-
- if (torture_cleanup_begin())
- return;
- if (!cur_ops) {
- torture_cleanup_end();
- return;
- }
-
- if (reader_tasks) {
- for (i = 0; i < nrealreaders; i++)
- torture_stop_kthread(rcu_scale_reader,
- reader_tasks[i]);
- kfree(reader_tasks);
- }
-
- if (writer_tasks) {
- for (i = 0; i < nrealwriters; i++) {
- torture_stop_kthread(rcu_scale_writer,
- writer_tasks[i]);
- if (!writer_n_durations)
- continue;
- j = writer_n_durations[i];
- pr_alert("%s%s writer %d gps: %d\n",
- scale_type, SCALE_FLAG, i, j);
- ngps += j;
- }
- pr_alert("%s%s start: %llu end: %llu duration: %llu gps: %d batches: %ld\n",
- scale_type, SCALE_FLAG,
- t_rcu_scale_writer_started, t_rcu_scale_writer_finished,
- t_rcu_scale_writer_finished -
- t_rcu_scale_writer_started,
- ngps,
- rcuscale_seq_diff(b_rcu_gp_test_finished,
- b_rcu_gp_test_started));
- for (i = 0; i < nrealwriters; i++) {
- if (!writer_durations)
- break;
- if (!writer_n_durations)
- continue;
- wdpp = writer_durations[i];
- if (!wdpp)
- continue;
- for (j = 0; j < writer_n_durations[i]; j++) {
- wdp = &wdpp[j];
- pr_alert("%s%s %4d writer-duration: %5d %llu\n",
- scale_type, SCALE_FLAG,
- i, j, *wdp);
- if (j % 100 == 0)
- schedule_timeout_uninterruptible(1);
- }
- kfree(writer_durations[i]);
- }
- kfree(writer_tasks);
- kfree(writer_durations);
- kfree(writer_n_durations);
- }
-
- /* Do torture-type-specific cleanup operations. */
- if (cur_ops->cleanup != NULL)
- cur_ops->cleanup();
-
- torture_cleanup_end();
-}
-
/*
* Return the number if non-negative. If -1, the number of CPUs.
* If less than -1, that much less than the number of CPUs, but
@@ -624,21 +541,6 @@ static int compute_real(int n)
return nr;
}
-/*
- * RCU scalability shutdown kthread. Just waits to be awakened, then shuts
- * down system.
- */
-static int
-rcu_scale_shutdown(void *arg)
-{
- wait_event(shutdown_wq,
- atomic_read(&n_rcu_scale_writer_finished) >= nrealwriters);
- smp_mb(); /* Wake before output. */
- rcu_scale_cleanup();
- kernel_power_off();
- return -EINVAL;
-}
-
/*
* kfree_rcu() scalability tests: Start a kfree_rcu() loop on all CPUs for number
* of iterations and measure total time and number of GP for all iterations to complete.
@@ -875,6 +777,109 @@ kfree_scale_init(void)
return firsterr;
}
+static void
+rcu_scale_cleanup(void)
+{
+ int i;
+ int j;
+ int ngps = 0;
+ u64 *wdp;
+ u64 *wdpp;
+
+ /*
+ * Would like warning at start, but everything is expedited
+ * during the mid-boot phase, so have to wait till the end.
+ */
+ if (rcu_gp_is_expedited() && !rcu_gp_is_normal() && !gp_exp)
+ SCALEOUT_ERRSTRING("All grace periods expedited, no normal ones to measure!");
+ if (rcu_gp_is_normal() && gp_exp)
+ SCALEOUT_ERRSTRING("All grace periods normal, no expedited ones to measure!");
+ if (gp_exp && gp_async)
+ SCALEOUT_ERRSTRING("No expedited async GPs, so went with async!");
+
+ if (kfree_rcu_test) {
+ kfree_scale_cleanup();
+ return;
+ }
+
+ if (torture_cleanup_begin())
+ return;
+ if (!cur_ops) {
+ torture_cleanup_end();
+ return;
+ }
+
+ if (reader_tasks) {
+ for (i = 0; i < nrealreaders; i++)
+ torture_stop_kthread(rcu_scale_reader,
+ reader_tasks[i]);
+ kfree(reader_tasks);
+ }
+
+ if (writer_tasks) {
+ for (i = 0; i < nrealwriters; i++) {
+ torture_stop_kthread(rcu_scale_writer,
+ writer_tasks[i]);
+ if (!writer_n_durations)
+ continue;
+ j = writer_n_durations[i];
+ pr_alert("%s%s writer %d gps: %d\n",
+ scale_type, SCALE_FLAG, i, j);
+ ngps += j;
+ }
+ pr_alert("%s%s start: %llu end: %llu duration: %llu gps: %d batches: %ld\n",
+ scale_type, SCALE_FLAG,
+ t_rcu_scale_writer_started, t_rcu_scale_writer_finished,
+ t_rcu_scale_writer_finished -
+ t_rcu_scale_writer_started,
+ ngps,
+ rcuscale_seq_diff(b_rcu_gp_test_finished,
+ b_rcu_gp_test_started));
+ for (i = 0; i < nrealwriters; i++) {
+ if (!writer_durations)
+ break;
+ if (!writer_n_durations)
+ continue;
+ wdpp = writer_durations[i];
+ if (!wdpp)
+ continue;
+ for (j = 0; j < writer_n_durations[i]; j++) {
+ wdp = &wdpp[j];
+ pr_alert("%s%s %4d writer-duration: %5d %llu\n",
+ scale_type, SCALE_FLAG,
+ i, j, *wdp);
+ if (j % 100 == 0)
+ schedule_timeout_uninterruptible(1);
+ }
+ kfree(writer_durations[i]);
+ }
+ kfree(writer_tasks);
+ kfree(writer_durations);
+ kfree(writer_n_durations);
+ }
+
+ /* Do torture-type-specific cleanup operations. */
+ if (cur_ops->cleanup != NULL)
+ cur_ops->cleanup();
+
+ torture_cleanup_end();
+}
+
+/*
+ * RCU scalability shutdown kthread. Just waits to be awakened, then shuts
+ * down system.
+ */
+static int
+rcu_scale_shutdown(void *arg)
+{
+ wait_event(shutdown_wq,
+ atomic_read(&n_rcu_scale_writer_finished) >= nrealwriters);
+ smp_mb(); /* Wake before output. */
+ rcu_scale_cleanup();
+ kernel_power_off();
+ return -EINVAL;
+}
+
static int __init
rcu_scale_init(void)
{
--
2.17.1
On Fri, Mar 17, 2023 at 01:50:06PM +0800, Qiuxu Zhuo wrote:
> When running the 'kfree_rcu_test' test case with commands [1] the call
> trace [2] was thrown. This was because the kfree_scale_thread thread(s)
> still run after unloading rcuscale and torture modules. Fix the call
> trace by invoking kfree_scale_cleanup() from rcu_scale_cleanup() when
> removing the rcuscale module.
>
> Additionally, current rcuscale.c defines kfree_scale_cleanup() after
> rcu_scale_cleanup(), to avoid the declaration of kfree_scale_cleanup()
> when rcu_scale_cleanup() invoking kfree_scale_cleanup(), move
> rcu_scale_cleanup() after kfree_scale_cleanup().
>
> [1] modprobe rcuscale kfree_rcu_test=1
> // After some time
> rmmod rcuscale
> rmmod torture
>
> [2] BUG: unable to handle page fault for address: ffffffffc0601a87
> #PF: supervisor instruction fetch in kernel mode
> #PF: error_code(0x0010) - not-present page
> PGD 11de4f067 P4D 11de4f067 PUD 11de51067 PMD 112f4d067 PTE 0
> Oops: 0010 [#1] PREEMPT SMP NOPTI
> CPU: 1 PID: 1798 Comm: kfree_scale_thr Not tainted 6.3.0-rc1-rcu+ #1
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
> RIP: 0010:0xffffffffc0601a87
> Code: Unable to access opcode bytes at 0xffffffffc0601a5d.
> RSP: 0018:ffffb25bc2e57e18 EFLAGS: 00010297
> RAX: 0000000000000000 RBX: ffffffffc061f0b6 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: ffffffff962fd0de RDI: ffffffff962fd0de
> RBP: ffffb25bc2e57ea8 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000
> R13: 0000000000000000 R14: 000000000000000a R15: 00000000001c1dbe
> FS: 0000000000000000(0000) GS:ffff921fa2200000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: ffffffffc0601a5d CR3: 000000011de4c006 CR4: 0000000000370ee0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
> <TASK>
> ? kvfree_call_rcu+0xf0/0x3a0
> ? kthread+0xf3/0x120
> ? kthread_complete_and_exit+0x20/0x20
> ? ret_from_fork+0x1f/0x30
> </TASK>
> Modules linked in: rfkill sunrpc ... [last unloaded: torture]
> CR2: ffffffffc0601a87
> ---[ end trace 0000000000000000 ]---
>
> Fixes: e6e78b004fa7 ("rcuperf: Add kfree_rcu() performance Tests")
> Signed-off-by: Qiuxu Zhuo <[email protected]>
Much better!
Except that someone glancing at this patch would be hard pressed to
see what changed.
So could you please split this into two patches, the first of which
does nothing but move code, and the second of which makes the actual
change?
The commit log for the first patch needs to clearly state that the
it is code-motion-only, with no change in behavior.
Thanx, Paul
> ---
> v1 -> v2:
>
> - Move rcu_scale_cleanup() after kfree_scale_cleanup() to eliminate the
> declaration of kfree_scale_cleanup().
>
> - Remove the unnecessary step "modprobe torture" from the commit message.
>
> - Add the description for why move rcu_scale_cleanup() after
> kfree_scale_cleanup() to the commit message.
>
> Thanks Paul's comments on eliminating the extra function declaration and
> removing the unnecessary "modprobe torture" from the commit message.
>
> Thanks Joel's constructive comments that for long-term maintenance we may
> need to split out the common code within current {ref,rcu}scale.c files.
>
> kernel/rcu/rcuscale.c | 201 ++++++++++++++++++++++--------------------
> 1 file changed, 103 insertions(+), 98 deletions(-)
>
> diff --git a/kernel/rcu/rcuscale.c b/kernel/rcu/rcuscale.c
> index 91fb5905a008..5a000d26f03e 100644
> --- a/kernel/rcu/rcuscale.c
> +++ b/kernel/rcu/rcuscale.c
> @@ -522,89 +522,6 @@ rcu_scale_print_module_parms(struct rcu_scale_ops *cur_ops, const char *tag)
> scale_type, tag, nrealreaders, nrealwriters, verbose, shutdown);
> }
>
> -static void
> -rcu_scale_cleanup(void)
> -{
> - int i;
> - int j;
> - int ngps = 0;
> - u64 *wdp;
> - u64 *wdpp;
> -
> - /*
> - * Would like warning at start, but everything is expedited
> - * during the mid-boot phase, so have to wait till the end.
> - */
> - if (rcu_gp_is_expedited() && !rcu_gp_is_normal() && !gp_exp)
> - SCALEOUT_ERRSTRING("All grace periods expedited, no normal ones to measure!");
> - if (rcu_gp_is_normal() && gp_exp)
> - SCALEOUT_ERRSTRING("All grace periods normal, no expedited ones to measure!");
> - if (gp_exp && gp_async)
> - SCALEOUT_ERRSTRING("No expedited async GPs, so went with async!");
> -
> - if (torture_cleanup_begin())
> - return;
> - if (!cur_ops) {
> - torture_cleanup_end();
> - return;
> - }
> -
> - if (reader_tasks) {
> - for (i = 0; i < nrealreaders; i++)
> - torture_stop_kthread(rcu_scale_reader,
> - reader_tasks[i]);
> - kfree(reader_tasks);
> - }
> -
> - if (writer_tasks) {
> - for (i = 0; i < nrealwriters; i++) {
> - torture_stop_kthread(rcu_scale_writer,
> - writer_tasks[i]);
> - if (!writer_n_durations)
> - continue;
> - j = writer_n_durations[i];
> - pr_alert("%s%s writer %d gps: %d\n",
> - scale_type, SCALE_FLAG, i, j);
> - ngps += j;
> - }
> - pr_alert("%s%s start: %llu end: %llu duration: %llu gps: %d batches: %ld\n",
> - scale_type, SCALE_FLAG,
> - t_rcu_scale_writer_started, t_rcu_scale_writer_finished,
> - t_rcu_scale_writer_finished -
> - t_rcu_scale_writer_started,
> - ngps,
> - rcuscale_seq_diff(b_rcu_gp_test_finished,
> - b_rcu_gp_test_started));
> - for (i = 0; i < nrealwriters; i++) {
> - if (!writer_durations)
> - break;
> - if (!writer_n_durations)
> - continue;
> - wdpp = writer_durations[i];
> - if (!wdpp)
> - continue;
> - for (j = 0; j < writer_n_durations[i]; j++) {
> - wdp = &wdpp[j];
> - pr_alert("%s%s %4d writer-duration: %5d %llu\n",
> - scale_type, SCALE_FLAG,
> - i, j, *wdp);
> - if (j % 100 == 0)
> - schedule_timeout_uninterruptible(1);
> - }
> - kfree(writer_durations[i]);
> - }
> - kfree(writer_tasks);
> - kfree(writer_durations);
> - kfree(writer_n_durations);
> - }
> -
> - /* Do torture-type-specific cleanup operations. */
> - if (cur_ops->cleanup != NULL)
> - cur_ops->cleanup();
> -
> - torture_cleanup_end();
> -}
> -
> /*
> * Return the number if non-negative. If -1, the number of CPUs.
> * If less than -1, that much less than the number of CPUs, but
> @@ -624,21 +541,6 @@ static int compute_real(int n)
> return nr;
> }
>
> -/*
> - * RCU scalability shutdown kthread. Just waits to be awakened, then shuts
> - * down system.
> - */
> -static int
> -rcu_scale_shutdown(void *arg)
> -{
> - wait_event(shutdown_wq,
> - atomic_read(&n_rcu_scale_writer_finished) >= nrealwriters);
> - smp_mb(); /* Wake before output. */
> - rcu_scale_cleanup();
> - kernel_power_off();
> - return -EINVAL;
> -}
> -
> /*
> * kfree_rcu() scalability tests: Start a kfree_rcu() loop on all CPUs for number
> * of iterations and measure total time and number of GP for all iterations to complete.
> @@ -875,6 +777,109 @@ kfree_scale_init(void)
> return firsterr;
> }
>
> +static void
> +rcu_scale_cleanup(void)
> +{
> + int i;
> + int j;
> + int ngps = 0;
> + u64 *wdp;
> + u64 *wdpp;
> +
> + /*
> + * Would like warning at start, but everything is expedited
> + * during the mid-boot phase, so have to wait till the end.
> + */
> + if (rcu_gp_is_expedited() && !rcu_gp_is_normal() && !gp_exp)
> + SCALEOUT_ERRSTRING("All grace periods expedited, no normal ones to measure!");
> + if (rcu_gp_is_normal() && gp_exp)
> + SCALEOUT_ERRSTRING("All grace periods normal, no expedited ones to measure!");
> + if (gp_exp && gp_async)
> + SCALEOUT_ERRSTRING("No expedited async GPs, so went with async!");
> +
> + if (kfree_rcu_test) {
> + kfree_scale_cleanup();
> + return;
> + }
> +
> + if (torture_cleanup_begin())
> + return;
> + if (!cur_ops) {
> + torture_cleanup_end();
> + return;
> + }
> +
> + if (reader_tasks) {
> + for (i = 0; i < nrealreaders; i++)
> + torture_stop_kthread(rcu_scale_reader,
> + reader_tasks[i]);
> + kfree(reader_tasks);
> + }
> +
> + if (writer_tasks) {
> + for (i = 0; i < nrealwriters; i++) {
> + torture_stop_kthread(rcu_scale_writer,
> + writer_tasks[i]);
> + if (!writer_n_durations)
> + continue;
> + j = writer_n_durations[i];
> + pr_alert("%s%s writer %d gps: %d\n",
> + scale_type, SCALE_FLAG, i, j);
> + ngps += j;
> + }
> + pr_alert("%s%s start: %llu end: %llu duration: %llu gps: %d batches: %ld\n",
> + scale_type, SCALE_FLAG,
> + t_rcu_scale_writer_started, t_rcu_scale_writer_finished,
> + t_rcu_scale_writer_finished -
> + t_rcu_scale_writer_started,
> + ngps,
> + rcuscale_seq_diff(b_rcu_gp_test_finished,
> + b_rcu_gp_test_started));
> + for (i = 0; i < nrealwriters; i++) {
> + if (!writer_durations)
> + break;
> + if (!writer_n_durations)
> + continue;
> + wdpp = writer_durations[i];
> + if (!wdpp)
> + continue;
> + for (j = 0; j < writer_n_durations[i]; j++) {
> + wdp = &wdpp[j];
> + pr_alert("%s%s %4d writer-duration: %5d %llu\n",
> + scale_type, SCALE_FLAG,
> + i, j, *wdp);
> + if (j % 100 == 0)
> + schedule_timeout_uninterruptible(1);
> + }
> + kfree(writer_durations[i]);
> + }
> + kfree(writer_tasks);
> + kfree(writer_durations);
> + kfree(writer_n_durations);
> + }
> +
> + /* Do torture-type-specific cleanup operations. */
> + if (cur_ops->cleanup != NULL)
> + cur_ops->cleanup();
> +
> + torture_cleanup_end();
> +}
> +
> +/*
> + * RCU scalability shutdown kthread. Just waits to be awakened, then shuts
> + * down system.
> + */
> +static int
> +rcu_scale_shutdown(void *arg)
> +{
> + wait_event(shutdown_wq,
> + atomic_read(&n_rcu_scale_writer_finished) >= nrealwriters);
> + smp_mb(); /* Wake before output. */
> + rcu_scale_cleanup();
> + kernel_power_off();
> + return -EINVAL;
> +}
> +
> static int __init
> rcu_scale_init(void)
> {
> --
> 2.17.1
>
> From: Paul E. McKenney <[email protected]>
> Sent: Tuesday, March 21, 2023 4:41 AM
> [...]
> >
> > Fixes: e6e78b004fa7 ("rcuperf: Add kfree_rcu() performance Tests")
> > Signed-off-by: Qiuxu Zhuo <[email protected]>
>
> Much better!
>
> Except that someone glancing at this patch would be hard pressed to see
> what changed.
Indeed.
> So could you please split this into two patches, the first of which does
> nothing but move code, and the second of which makes the actual change?
OK.
Will split this patch into two patches in the v3.
> The commit log for the first patch needs to clearly state that the it is code-
> motion-only, with no change in behavior.
OK.
Thanks so much for your suggestions ;-)
>
> Thanx, Paul
On Tue, Mar 21, 2023 at 01:21:21AM +0000, Zhuo, Qiuxu wrote:
> > From: Paul E. McKenney <[email protected]>
> > Sent: Tuesday, March 21, 2023 4:41 AM
> > [...]
> > >
> > > Fixes: e6e78b004fa7 ("rcuperf: Add kfree_rcu() performance Tests")
> > > Signed-off-by: Qiuxu Zhuo <[email protected]>
> >
> > Much better!
> >
> > Except that someone glancing at this patch would be hard pressed to see
> > what changed.
>
> Indeed.
>
> > So could you please split this into two patches, the first of which does
> > nothing but move code, and the second of which makes the actual change?
>
> OK.
> Will split this patch into two patches in the v3.
>
> > The commit log for the first patch needs to clearly state that the it is code-
> > motion-only, with no change in behavior.
>
> OK.
> Thanks so much for your suggestions ;-)
Looking forward to seeing the next version!
Thanx, Paul
Move rcu_scale_cleanup() and rcu_scale_shutdown() functions after
kfree_scale_cleanup(). There are no function changes.
The intention of this moving is a preparation for a subsequent patch
that will fix a call-trace bug by invoking kfree_scale_cleanup() from
rcu_scale_cleanup() without the declaration of kfree_scale_cleanup().
Signed-off-by: Qiuxu Zhuo <[email protected]>
---
Paul, please modify the commit message if needed. Thanks!
kernel/rcu/rcuscale.c | 196 +++++++++++++++++++++---------------------
1 file changed, 98 insertions(+), 98 deletions(-)
diff --git a/kernel/rcu/rcuscale.c b/kernel/rcu/rcuscale.c
index 91fb5905a008..e99096a4f094 100644
--- a/kernel/rcu/rcuscale.c
+++ b/kernel/rcu/rcuscale.c
@@ -522,89 +522,6 @@ rcu_scale_print_module_parms(struct rcu_scale_ops *cur_ops, const char *tag)
scale_type, tag, nrealreaders, nrealwriters, verbose, shutdown);
}
-static void
-rcu_scale_cleanup(void)
-{
- int i;
- int j;
- int ngps = 0;
- u64 *wdp;
- u64 *wdpp;
-
- /*
- * Would like warning at start, but everything is expedited
- * during the mid-boot phase, so have to wait till the end.
- */
- if (rcu_gp_is_expedited() && !rcu_gp_is_normal() && !gp_exp)
- SCALEOUT_ERRSTRING("All grace periods expedited, no normal ones to measure!");
- if (rcu_gp_is_normal() && gp_exp)
- SCALEOUT_ERRSTRING("All grace periods normal, no expedited ones to measure!");
- if (gp_exp && gp_async)
- SCALEOUT_ERRSTRING("No expedited async GPs, so went with async!");
-
- if (torture_cleanup_begin())
- return;
- if (!cur_ops) {
- torture_cleanup_end();
- return;
- }
-
- if (reader_tasks) {
- for (i = 0; i < nrealreaders; i++)
- torture_stop_kthread(rcu_scale_reader,
- reader_tasks[i]);
- kfree(reader_tasks);
- }
-
- if (writer_tasks) {
- for (i = 0; i < nrealwriters; i++) {
- torture_stop_kthread(rcu_scale_writer,
- writer_tasks[i]);
- if (!writer_n_durations)
- continue;
- j = writer_n_durations[i];
- pr_alert("%s%s writer %d gps: %d\n",
- scale_type, SCALE_FLAG, i, j);
- ngps += j;
- }
- pr_alert("%s%s start: %llu end: %llu duration: %llu gps: %d batches: %ld\n",
- scale_type, SCALE_FLAG,
- t_rcu_scale_writer_started, t_rcu_scale_writer_finished,
- t_rcu_scale_writer_finished -
- t_rcu_scale_writer_started,
- ngps,
- rcuscale_seq_diff(b_rcu_gp_test_finished,
- b_rcu_gp_test_started));
- for (i = 0; i < nrealwriters; i++) {
- if (!writer_durations)
- break;
- if (!writer_n_durations)
- continue;
- wdpp = writer_durations[i];
- if (!wdpp)
- continue;
- for (j = 0; j < writer_n_durations[i]; j++) {
- wdp = &wdpp[j];
- pr_alert("%s%s %4d writer-duration: %5d %llu\n",
- scale_type, SCALE_FLAG,
- i, j, *wdp);
- if (j % 100 == 0)
- schedule_timeout_uninterruptible(1);
- }
- kfree(writer_durations[i]);
- }
- kfree(writer_tasks);
- kfree(writer_durations);
- kfree(writer_n_durations);
- }
-
- /* Do torture-type-specific cleanup operations. */
- if (cur_ops->cleanup != NULL)
- cur_ops->cleanup();
-
- torture_cleanup_end();
-}
-
/*
* Return the number if non-negative. If -1, the number of CPUs.
* If less than -1, that much less than the number of CPUs, but
@@ -624,21 +541,6 @@ static int compute_real(int n)
return nr;
}
-/*
- * RCU scalability shutdown kthread. Just waits to be awakened, then shuts
- * down system.
- */
-static int
-rcu_scale_shutdown(void *arg)
-{
- wait_event(shutdown_wq,
- atomic_read(&n_rcu_scale_writer_finished) >= nrealwriters);
- smp_mb(); /* Wake before output. */
- rcu_scale_cleanup();
- kernel_power_off();
- return -EINVAL;
-}
-
/*
* kfree_rcu() scalability tests: Start a kfree_rcu() loop on all CPUs for number
* of iterations and measure total time and number of GP for all iterations to complete.
@@ -875,6 +777,104 @@ kfree_scale_init(void)
return firsterr;
}
+static void
+rcu_scale_cleanup(void)
+{
+ int i;
+ int j;
+ int ngps = 0;
+ u64 *wdp;
+ u64 *wdpp;
+
+ /*
+ * Would like warning at start, but everything is expedited
+ * during the mid-boot phase, so have to wait till the end.
+ */
+ if (rcu_gp_is_expedited() && !rcu_gp_is_normal() && !gp_exp)
+ SCALEOUT_ERRSTRING("All grace periods expedited, no normal ones to measure!");
+ if (rcu_gp_is_normal() && gp_exp)
+ SCALEOUT_ERRSTRING("All grace periods normal, no expedited ones to measure!");
+ if (gp_exp && gp_async)
+ SCALEOUT_ERRSTRING("No expedited async GPs, so went with async!");
+
+ if (torture_cleanup_begin())
+ return;
+ if (!cur_ops) {
+ torture_cleanup_end();
+ return;
+ }
+
+ if (reader_tasks) {
+ for (i = 0; i < nrealreaders; i++)
+ torture_stop_kthread(rcu_scale_reader,
+ reader_tasks[i]);
+ kfree(reader_tasks);
+ }
+
+ if (writer_tasks) {
+ for (i = 0; i < nrealwriters; i++) {
+ torture_stop_kthread(rcu_scale_writer,
+ writer_tasks[i]);
+ if (!writer_n_durations)
+ continue;
+ j = writer_n_durations[i];
+ pr_alert("%s%s writer %d gps: %d\n",
+ scale_type, SCALE_FLAG, i, j);
+ ngps += j;
+ }
+ pr_alert("%s%s start: %llu end: %llu duration: %llu gps: %d batches: %ld\n",
+ scale_type, SCALE_FLAG,
+ t_rcu_scale_writer_started, t_rcu_scale_writer_finished,
+ t_rcu_scale_writer_finished -
+ t_rcu_scale_writer_started,
+ ngps,
+ rcuscale_seq_diff(b_rcu_gp_test_finished,
+ b_rcu_gp_test_started));
+ for (i = 0; i < nrealwriters; i++) {
+ if (!writer_durations)
+ break;
+ if (!writer_n_durations)
+ continue;
+ wdpp = writer_durations[i];
+ if (!wdpp)
+ continue;
+ for (j = 0; j < writer_n_durations[i]; j++) {
+ wdp = &wdpp[j];
+ pr_alert("%s%s %4d writer-duration: %5d %llu\n",
+ scale_type, SCALE_FLAG,
+ i, j, *wdp);
+ if (j % 100 == 0)
+ schedule_timeout_uninterruptible(1);
+ }
+ kfree(writer_durations[i]);
+ }
+ kfree(writer_tasks);
+ kfree(writer_durations);
+ kfree(writer_n_durations);
+ }
+
+ /* Do torture-type-specific cleanup operations. */
+ if (cur_ops->cleanup != NULL)
+ cur_ops->cleanup();
+
+ torture_cleanup_end();
+}
+
+/*
+ * RCU scalability shutdown kthread. Just waits to be awakened, then shuts
+ * down system.
+ */
+static int
+rcu_scale_shutdown(void *arg)
+{
+ wait_event(shutdown_wq,
+ atomic_read(&n_rcu_scale_writer_finished) >= nrealwriters);
+ smp_mb(); /* Wake before output. */
+ rcu_scale_cleanup();
+ kernel_power_off();
+ return -EINVAL;
+}
+
static int __init
rcu_scale_init(void)
{
--
2.17.1
When running the 'kfree_rcu_test' test case with commands [1] the call
trace [2] was thrown. This was because the kfree_scale_thread thread(s)
still run after unloading rcuscale and torture modules. Fix the call
trace by invoking kfree_scale_cleanup() from rcu_scale_cleanup() when
removing the rcuscale module.
[1] modprobe rcuscale kfree_rcu_test=1
// After some time
rmmod rcuscale
rmmod torture
[2] BUG: unable to handle page fault for address: ffffffffc0601a87
#PF: supervisor instruction fetch in kernel mode
#PF: error_code(0x0010) - not-present page
PGD 11de4f067 P4D 11de4f067 PUD 11de51067 PMD 112f4d067 PTE 0
Oops: 0010 [#1] PREEMPT SMP NOPTI
CPU: 1 PID: 1798 Comm: kfree_scale_thr Not tainted 6.3.0-rc1-rcu+ #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
RIP: 0010:0xffffffffc0601a87
Code: Unable to access opcode bytes at 0xffffffffc0601a5d.
RSP: 0018:ffffb25bc2e57e18 EFLAGS: 00010297
RAX: 0000000000000000 RBX: ffffffffc061f0b6 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffffff962fd0de RDI: ffffffff962fd0de
RBP: ffffb25bc2e57ea8 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000
R13: 0000000000000000 R14: 000000000000000a R15: 00000000001c1dbe
FS: 0000000000000000(0000) GS:ffff921fa2200000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffc0601a5d CR3: 000000011de4c006 CR4: 0000000000370ee0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
? kvfree_call_rcu+0xf0/0x3a0
? kthread+0xf3/0x120
? kthread_complete_and_exit+0x20/0x20
? ret_from_fork+0x1f/0x30
</TASK>
Modules linked in: rfkill sunrpc ... [last unloaded: torture]
CR2: ffffffffc0601a87
---[ end trace 0000000000000000 ]---
Fixes: e6e78b004fa7 ("rcuperf: Add kfree_rcu() performance Tests")
Signed-off-by: Qiuxu Zhuo <[email protected]>
---
v1 -> v2:
- Move rcu_scale_cleanup() after kfree_scale_cleanup() to eliminate the
declaration of kfree_scale_cleanup().
- Remove the unnecessary step "modprobe torture" from the commit message.
- Add the description for why move rcu_scale_cleanup() after
kfree_scale_cleanup() to the commit message.
v2 -> v3:
- Split the single v2 patch into two patches.
- Move the commit message description for why move rcu_scale_cleanup()
after kfree_scale_cleanup() to Patch 1.
kernel/rcu/rcuscale.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/kernel/rcu/rcuscale.c b/kernel/rcu/rcuscale.c
index e99096a4f094..5a000d26f03e 100644
--- a/kernel/rcu/rcuscale.c
+++ b/kernel/rcu/rcuscale.c
@@ -797,6 +797,11 @@ rcu_scale_cleanup(void)
if (gp_exp && gp_async)
SCALEOUT_ERRSTRING("No expedited async GPs, so went with async!");
+ if (kfree_rcu_test) {
+ kfree_scale_cleanup();
+ return;
+ }
+
if (torture_cleanup_begin())
return;
if (!cur_ops) {
--
2.17.1
On Tue, 21 Mar 2023, Qiuxu Zhuo wrote:
>When running the 'kfree_rcu_test' test case with commands [1] the call
>trace [2] was thrown. This was because the kfree_scale_thread thread(s)
>still run after unloading rcuscale and torture modules. Fix the call
>trace by invoking kfree_scale_cleanup() from rcu_scale_cleanup() when
>removing the rcuscale module.
>
>[1] modprobe rcuscale kfree_rcu_test=1
> // After some time
> rmmod rcuscale
> rmmod torture
>
>[2] BUG: unable to handle page fault for address: ffffffffc0601a87
> #PF: supervisor instruction fetch in kernel mode
> #PF: error_code(0x0010) - not-present page
> PGD 11de4f067 P4D 11de4f067 PUD 11de51067 PMD 112f4d067 PTE 0
> Oops: 0010 [#1] PREEMPT SMP NOPTI
> CPU: 1 PID: 1798 Comm: kfree_scale_thr Not tainted 6.3.0-rc1-rcu+ #1
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
> RIP: 0010:0xffffffffc0601a87
> Code: Unable to access opcode bytes at 0xffffffffc0601a5d.
> RSP: 0018:ffffb25bc2e57e18 EFLAGS: 00010297
> RAX: 0000000000000000 RBX: ffffffffc061f0b6 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: ffffffff962fd0de RDI: ffffffff962fd0de
> RBP: ffffb25bc2e57ea8 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000
> R13: 0000000000000000 R14: 000000000000000a R15: 00000000001c1dbe
> FS: 0000000000000000(0000) GS:ffff921fa2200000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: ffffffffc0601a5d CR3: 000000011de4c006 CR4: 0000000000370ee0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
> <TASK>
> ? kvfree_call_rcu+0xf0/0x3a0
> ? kthread+0xf3/0x120
> ? kthread_complete_and_exit+0x20/0x20
> ? ret_from_fork+0x1f/0x30
> </TASK>
> Modules linked in: rfkill sunrpc ... [last unloaded: torture]
> CR2: ffffffffc0601a87
> ---[ end trace 0000000000000000 ]---
>
>Fixes: e6e78b004fa7 ("rcuperf: Add kfree_rcu() performance Tests")
>Signed-off-by: Qiuxu Zhuo <[email protected]>
Reviewed-by: Davidlohr Bueso <[email protected]>
On Tue, Mar 21, 2023 at 08:47:51AM -0700, Davidlohr Bueso wrote:
> On Tue, 21 Mar 2023, Qiuxu Zhuo wrote:
>
> > When running the 'kfree_rcu_test' test case with commands [1] the call
> > trace [2] was thrown. This was because the kfree_scale_thread thread(s)
> > still run after unloading rcuscale and torture modules. Fix the call
> > trace by invoking kfree_scale_cleanup() from rcu_scale_cleanup() when
> > removing the rcuscale module.
> >
> > [1] modprobe rcuscale kfree_rcu_test=1
> > // After some time
> > rmmod rcuscale
> > rmmod torture
> >
> > [2] BUG: unable to handle page fault for address: ffffffffc0601a87
> > #PF: supervisor instruction fetch in kernel mode
> > #PF: error_code(0x0010) - not-present page
> > PGD 11de4f067 P4D 11de4f067 PUD 11de51067 PMD 112f4d067 PTE 0
> > Oops: 0010 [#1] PREEMPT SMP NOPTI
> > CPU: 1 PID: 1798 Comm: kfree_scale_thr Not tainted 6.3.0-rc1-rcu+ #1
> > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
> > RIP: 0010:0xffffffffc0601a87
> > Code: Unable to access opcode bytes at 0xffffffffc0601a5d.
> > RSP: 0018:ffffb25bc2e57e18 EFLAGS: 00010297
> > RAX: 0000000000000000 RBX: ffffffffc061f0b6 RCX: 0000000000000000
> > RDX: 0000000000000000 RSI: ffffffff962fd0de RDI: ffffffff962fd0de
> > RBP: ffffb25bc2e57ea8 R08: 0000000000000000 R09: 0000000000000000
> > R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000
> > R13: 0000000000000000 R14: 000000000000000a R15: 00000000001c1dbe
> > FS: 0000000000000000(0000) GS:ffff921fa2200000(0000) knlGS:0000000000000000
> > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: ffffffffc0601a5d CR3: 000000011de4c006 CR4: 0000000000370ee0
> > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > Call Trace:
> > <TASK>
> > ? kvfree_call_rcu+0xf0/0x3a0
> > ? kthread+0xf3/0x120
> > ? kthread_complete_and_exit+0x20/0x20
> > ? ret_from_fork+0x1f/0x30
> > </TASK>
> > Modules linked in: rfkill sunrpc ... [last unloaded: torture]
> > CR2: ffffffffc0601a87
> > ---[ end trace 0000000000000000 ]---
> >
> > Fixes: e6e78b004fa7 ("rcuperf: Add kfree_rcu() performance Tests")
> > Signed-off-by: Qiuxu Zhuo <[email protected]>
>
> Reviewed-by: Davidlohr Bueso <[email protected]>
Much better, thank you both!
But unfortunately, these patches do not apply cleanly. Qiuxu Zhuo,
could you please forward port these to the -rcu "dev" branch [1]?
Thanx, Paul
[1] https://mirrors.edge.kernel.org/pub/linux/kernel/people/paulmck/rcutodo.html
On Tue, Mar 21, 2023 at 3:24 PM Paul E. McKenney <[email protected]> wrote:
>
> On Tue, Mar 21, 2023 at 08:47:51AM -0700, Davidlohr Bueso wrote:
> > On Tue, 21 Mar 2023, Qiuxu Zhuo wrote:
> >
> > > When running the 'kfree_rcu_test' test case with commands [1] the call
> > > trace [2] was thrown. This was because the kfree_scale_thread thread(s)
> > > still run after unloading rcuscale and torture modules. Fix the call
> > > trace by invoking kfree_scale_cleanup() from rcu_scale_cleanup() when
> > > removing the rcuscale module.
> > >
> > > [1] modprobe rcuscale kfree_rcu_test=1
> > > // After some time
> > > rmmod rcuscale
> > > rmmod torture
> > >
> > > [2] BUG: unable to handle page fault for address: ffffffffc0601a87
> > > #PF: supervisor instruction fetch in kernel mode
> > > #PF: error_code(0x0010) - not-present page
> > > PGD 11de4f067 P4D 11de4f067 PUD 11de51067 PMD 112f4d067 PTE 0
> > > Oops: 0010 [#1] PREEMPT SMP NOPTI
> > > CPU: 1 PID: 1798 Comm: kfree_scale_thr Not tainted 6.3.0-rc1-rcu+ #1
> > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
> > > RIP: 0010:0xffffffffc0601a87
> > > Code: Unable to access opcode bytes at 0xffffffffc0601a5d.
> > > RSP: 0018:ffffb25bc2e57e18 EFLAGS: 00010297
> > > RAX: 0000000000000000 RBX: ffffffffc061f0b6 RCX: 0000000000000000
> > > RDX: 0000000000000000 RSI: ffffffff962fd0de RDI: ffffffff962fd0de
> > > RBP: ffffb25bc2e57ea8 R08: 0000000000000000 R09: 0000000000000000
> > > R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000
> > > R13: 0000000000000000 R14: 000000000000000a R15: 00000000001c1dbe
> > > FS: 0000000000000000(0000) GS:ffff921fa2200000(0000) knlGS:0000000000000000
> > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > CR2: ffffffffc0601a5d CR3: 000000011de4c006 CR4: 0000000000370ee0
> > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > > Call Trace:
> > > <TASK>
> > > ? kvfree_call_rcu+0xf0/0x3a0
> > > ? kthread+0xf3/0x120
> > > ? kthread_complete_and_exit+0x20/0x20
> > > ? ret_from_fork+0x1f/0x30
> > > </TASK>
> > > Modules linked in: rfkill sunrpc ... [last unloaded: torture]
> > > CR2: ffffffffc0601a87
> > > ---[ end trace 0000000000000000 ]---
> > >
> > > Fixes: e6e78b004fa7 ("rcuperf: Add kfree_rcu() performance Tests")
> > > Signed-off-by: Qiuxu Zhuo <[email protected]>
> >
> > Reviewed-by: Davidlohr Bueso <[email protected]>
>
> Much better, thank you both!
>
> But unfortunately, these patches do not apply cleanly. Qiuxu Zhuo,
> could you please forward port these to the -rcu "dev" branch [1]?
After making it cleanly apply:
Reviewed-by: Joel Fernandes (Google) <[email protected]>
thanks,
- Joel
>
> Thanx, Paul
>
> [1] https://mirrors.edge.kernel.org/pub/linux/kernel/people/paulmck/rcutodo.html
> From: Paul E. McKenney <[email protected]>
> [...]
> > > Fixes: e6e78b004fa7 ("rcuperf: Add kfree_rcu() performance Tests")
> > > Signed-off-by: Qiuxu Zhuo <[email protected]>
> >
> > Reviewed-by: Davidlohr Bueso <[email protected]>
>
> Much better, thank you both!
>
> But unfortunately, these patches do not apply cleanly. Qiuxu Zhuo, could
> you please forward port these to the -rcu "dev" branch [1]?
>
Hi Paul,
OK.
I'll be making v4 patches rebased on the top of the -rcu "dev" branch.
Thanks for letting me know more about the RCU patch workflow.
Also thank you Davidlohr Bueso and Joel for reviewing the patches.
- Qiuxu
> Thanx, Paul
>
> [1]
> https://mirrors.edge.kernel.org/pub/linux/kernel/people/paulmck/rcutodo.
> html
On Tue, Mar 21, 2023 at 9:26 PM Zhuo, Qiuxu <[email protected]> wrote:
>
> > From: Paul E. McKenney <[email protected]>
> > [...]
> > > > Fixes: e6e78b004fa7 ("rcuperf: Add kfree_rcu() performance Tests")
> > > > Signed-off-by: Qiuxu Zhuo <[email protected]>
> > >
> > > Reviewed-by: Davidlohr Bueso <[email protected]>
> >
> > Much better, thank you both!
> >
> > But unfortunately, these patches do not apply cleanly. Qiuxu Zhuo, could
> > you please forward port these to the -rcu "dev" branch [1]?
> >
>
> Hi Paul,
>
> OK.
> I'll be making v4 patches rebased on the top of the -rcu "dev" branch.
> Thanks for letting me know more about the RCU patch workflow.
>
> Also thank you Davidlohr Bueso and Joel for reviewing the patches.
You're welcome and thanks for your interactions on the mailing list
and RCU interest. :-)
- Joel
v1 -> v2:
- Move rcu_scale_cleanup() after kfree_scale_cleanup() to eliminate the
declaration of kfree_scale_cleanup().
- Remove the unnecessary step "modprobe torture" from the commit message.
- Add the description for why move rcu_scale_cleanup() after
kfree_scale_cleanup() to the commit message.
v2 -> v3:
- Split the single v2 patch into two patches.
- Move the commit message description for why move rcu_scale_cleanup()
after kfree_scale_cleanup() to Patch 1.
v3 -> v4 (No function changes):
- Rebased Patch 1 & 2 on top of the rcu "-dev" branch.
- Added two "Reviewed-by" tags to the commit message of Patch 2.
Qiuxu Zhuo (2):
rcu/rcuscale: Move rcu_scale_*() after kfree_scale_cleanup()
rcu/rcuscale: Stop kfree_scale_thread thread(s) after unloading rcuscale
kernel/rcu/rcuscale.c | 199 ++++++++++++++++++++++--------------------
1 file changed, 102 insertions(+), 97 deletions(-)
base-commit: 86b3da94693e412ca43d996a7e6ed5766ef7a5f4
--
2.17.1
When running the 'kfree_rcu_test' test case with commands [1] the call
trace [2] was thrown. This was because the kfree_scale_thread thread(s)
still run after unloading rcuscale and torture modules. Fix the call
trace by invoking kfree_scale_cleanup() from rcu_scale_cleanup() when
removing the rcuscale module.
[1] modprobe rcuscale kfree_rcu_test=1
// After some time
rmmod rcuscale
rmmod torture
[2] BUG: unable to handle page fault for address: ffffffffc0601a87
#PF: supervisor instruction fetch in kernel mode
#PF: error_code(0x0010) - not-present page
PGD 11de4f067 P4D 11de4f067 PUD 11de51067 PMD 112f4d067 PTE 0
Oops: 0010 [#1] PREEMPT SMP NOPTI
CPU: 1 PID: 1798 Comm: kfree_scale_thr Not tainted 6.3.0-rc1-rcu+ #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
RIP: 0010:0xffffffffc0601a87
Code: Unable to access opcode bytes at 0xffffffffc0601a5d.
RSP: 0018:ffffb25bc2e57e18 EFLAGS: 00010297
RAX: 0000000000000000 RBX: ffffffffc061f0b6 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffffff962fd0de RDI: ffffffff962fd0de
RBP: ffffb25bc2e57ea8 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000
R13: 0000000000000000 R14: 000000000000000a R15: 00000000001c1dbe
FS: 0000000000000000(0000) GS:ffff921fa2200000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffc0601a5d CR3: 000000011de4c006 CR4: 0000000000370ee0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
? kvfree_call_rcu+0xf0/0x3a0
? kthread+0xf3/0x120
? kthread_complete_and_exit+0x20/0x20
? ret_from_fork+0x1f/0x30
</TASK>
Modules linked in: rfkill sunrpc ... [last unloaded: torture]
CR2: ffffffffc0601a87
---[ end trace 0000000000000000 ]---
Fixes: e6e78b004fa7 ("rcuperf: Add kfree_rcu() performance Tests")
Reviewed-by: Davidlohr Bueso <[email protected]>
Reviewed-by: Joel Fernandes (Google) <[email protected]>
Signed-off-by: Qiuxu Zhuo <[email protected]>
---
kernel/rcu/rcuscale.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/kernel/rcu/rcuscale.c b/kernel/rcu/rcuscale.c
index 7e8965b0827a..d1221731c7cf 100644
--- a/kernel/rcu/rcuscale.c
+++ b/kernel/rcu/rcuscale.c
@@ -797,6 +797,11 @@ rcu_scale_cleanup(void)
if (gp_exp && gp_async)
SCALEOUT_ERRSTRING("No expedited async GPs, so went with async!");
+ if (kfree_rcu_test) {
+ kfree_scale_cleanup();
+ return;
+ }
+
if (torture_cleanup_begin())
return;
if (!cur_ops) {
--
2.17.1
Move rcu_scale_cleanup() and rcu_scale_shutdown() functions after
kfree_scale_cleanup(). There are no function changes.
The intention of this moving is a preparation for a subsequent patch
that will fix a call-trace bug by invoking kfree_scale_cleanup() from
rcu_scale_cleanup() without the declaration of kfree_scale_cleanup().
Signed-off-by: Qiuxu Zhuo <[email protected]>
---
kernel/rcu/rcuscale.c | 194 +++++++++++++++++++++---------------------
1 file changed, 97 insertions(+), 97 deletions(-)
diff --git a/kernel/rcu/rcuscale.c b/kernel/rcu/rcuscale.c
index e82ec9f9a5d8..7e8965b0827a 100644
--- a/kernel/rcu/rcuscale.c
+++ b/kernel/rcu/rcuscale.c
@@ -522,89 +522,6 @@ rcu_scale_print_module_parms(struct rcu_scale_ops *cur_ops, const char *tag)
scale_type, tag, nrealreaders, nrealwriters, verbose, shutdown);
}
-static void
-rcu_scale_cleanup(void)
-{
- int i;
- int j;
- int ngps = 0;
- u64 *wdp;
- u64 *wdpp;
-
- /*
- * Would like warning at start, but everything is expedited
- * during the mid-boot phase, so have to wait till the end.
- */
- if (rcu_gp_is_expedited() && !rcu_gp_is_normal() && !gp_exp)
- SCALEOUT_ERRSTRING("All grace periods expedited, no normal ones to measure!");
- if (rcu_gp_is_normal() && gp_exp)
- SCALEOUT_ERRSTRING("All grace periods normal, no expedited ones to measure!");
- if (gp_exp && gp_async)
- SCALEOUT_ERRSTRING("No expedited async GPs, so went with async!");
-
- if (torture_cleanup_begin())
- return;
- if (!cur_ops) {
- torture_cleanup_end();
- return;
- }
-
- if (reader_tasks) {
- for (i = 0; i < nrealreaders; i++)
- torture_stop_kthread(rcu_scale_reader,
- reader_tasks[i]);
- kfree(reader_tasks);
- }
-
- if (writer_tasks) {
- for (i = 0; i < nrealwriters; i++) {
- torture_stop_kthread(rcu_scale_writer,
- writer_tasks[i]);
- if (!writer_n_durations)
- continue;
- j = writer_n_durations[i];
- pr_alert("%s%s writer %d gps: %d\n",
- scale_type, SCALE_FLAG, i, j);
- ngps += j;
- }
- pr_alert("%s%s start: %llu end: %llu duration: %llu gps: %d batches: %ld\n",
- scale_type, SCALE_FLAG,
- t_rcu_scale_writer_started, t_rcu_scale_writer_finished,
- t_rcu_scale_writer_finished -
- t_rcu_scale_writer_started,
- ngps,
- rcuscale_seq_diff(b_rcu_gp_test_finished,
- b_rcu_gp_test_started));
- for (i = 0; i < nrealwriters; i++) {
- if (!writer_durations)
- break;
- if (!writer_n_durations)
- continue;
- wdpp = writer_durations[i];
- if (!wdpp)
- continue;
- for (j = 0; j < writer_n_durations[i]; j++) {
- wdp = &wdpp[j];
- pr_alert("%s%s %4d writer-duration: %5d %llu\n",
- scale_type, SCALE_FLAG,
- i, j, *wdp);
- if (j % 100 == 0)
- schedule_timeout_uninterruptible(1);
- }
- kfree(writer_durations[i]);
- }
- kfree(writer_tasks);
- kfree(writer_durations);
- kfree(writer_n_durations);
- }
-
- /* Do torture-type-specific cleanup operations. */
- if (cur_ops->cleanup != NULL)
- cur_ops->cleanup();
-
- torture_cleanup_end();
-}
-
/*
* Return the number if non-negative. If -1, the number of CPUs.
* If less than -1, that much less than the number of CPUs, but
@@ -624,20 +541,6 @@ static int compute_real(int n)
return nr;
}
-/*
- * RCU scalability shutdown kthread. Just waits to be awakened, then shuts
- * down system.
- */
-static int
-rcu_scale_shutdown(void *arg)
-{
- wait_event_idle(shutdown_wq, atomic_read(&n_rcu_scale_writer_finished) >= nrealwriters);
- smp_mb(); /* Wake before output. */
- rcu_scale_cleanup();
- kernel_power_off();
- return -EINVAL;
-}
-
/*
* kfree_rcu() scalability tests: Start a kfree_rcu() loop on all CPUs for number
* of iterations and measure total time and number of GP for all iterations to complete.
@@ -874,6 +777,103 @@ kfree_scale_init(void)
return firsterr;
}
+static void
+rcu_scale_cleanup(void)
+{
+ int i;
+ int j;
+ int ngps = 0;
+ u64 *wdp;
+ u64 *wdpp;
+
+ /*
+ * Would like warning at start, but everything is expedited
+ * during the mid-boot phase, so have to wait till the end.
+ */
+ if (rcu_gp_is_expedited() && !rcu_gp_is_normal() && !gp_exp)
+ SCALEOUT_ERRSTRING("All grace periods expedited, no normal ones to measure!");
+ if (rcu_gp_is_normal() && gp_exp)
+ SCALEOUT_ERRSTRING("All grace periods normal, no expedited ones to measure!");
+ if (gp_exp && gp_async)
+ SCALEOUT_ERRSTRING("No expedited async GPs, so went with async!");
+
+ if (torture_cleanup_begin())
+ return;
+ if (!cur_ops) {
+ torture_cleanup_end();
+ return;
+ }
+
+ if (reader_tasks) {
+ for (i = 0; i < nrealreaders; i++)
+ torture_stop_kthread(rcu_scale_reader,
+ reader_tasks[i]);
+ kfree(reader_tasks);
+ }
+
+ if (writer_tasks) {
+ for (i = 0; i < nrealwriters; i++) {
+ torture_stop_kthread(rcu_scale_writer,
+ writer_tasks[i]);
+ if (!writer_n_durations)
+ continue;
+ j = writer_n_durations[i];
+ pr_alert("%s%s writer %d gps: %d\n",
+ scale_type, SCALE_FLAG, i, j);
+ ngps += j;
+ }
+ pr_alert("%s%s start: %llu end: %llu duration: %llu gps: %d batches: %ld\n",
+ scale_type, SCALE_FLAG,
+ t_rcu_scale_writer_started, t_rcu_scale_writer_finished,
+ t_rcu_scale_writer_finished -
+ t_rcu_scale_writer_started,
+ ngps,
+ rcuscale_seq_diff(b_rcu_gp_test_finished,
+ b_rcu_gp_test_started));
+ for (i = 0; i < nrealwriters; i++) {
+ if (!writer_durations)
+ break;
+ if (!writer_n_durations)
+ continue;
+ wdpp = writer_durations[i];
+ if (!wdpp)
+ continue;
+ for (j = 0; j < writer_n_durations[i]; j++) {
+ wdp = &wdpp[j];
+ pr_alert("%s%s %4d writer-duration: %5d %llu\n",
+ scale_type, SCALE_FLAG,
+ i, j, *wdp);
+ if (j % 100 == 0)
+ schedule_timeout_uninterruptible(1);
+ }
+ kfree(writer_durations[i]);
+ }
+ kfree(writer_tasks);
+ kfree(writer_durations);
+ kfree(writer_n_durations);
+ }
+
+ /* Do torture-type-specific cleanup operations. */
+ if (cur_ops->cleanup != NULL)
+ cur_ops->cleanup();
+
+ torture_cleanup_end();
+}
+
+/*
+ * RCU scalability shutdown kthread. Just waits to be awakened, then shuts
+ * down system.
+ */
+static int
+rcu_scale_shutdown(void *arg)
+{
+ wait_event_idle(shutdown_wq, atomic_read(&n_rcu_scale_writer_finished) >= nrealwriters);
+ smp_mb(); /* Wake before output. */
+ rcu_scale_cleanup();
+ kernel_power_off();
+ return -EINVAL;
+}
+
static int __init
rcu_scale_init(void)
{
--
2.17.1
On Wed, Mar 22, 2023 at 07:42:41PM +0800, Qiuxu Zhuo wrote:
> When running the 'kfree_rcu_test' test case with commands [1] the call
> trace [2] was thrown. This was because the kfree_scale_thread thread(s)
> still run after unloading rcuscale and torture modules. Fix the call
> trace by invoking kfree_scale_cleanup() from rcu_scale_cleanup() when
> removing the rcuscale module.
>
> [1] modprobe rcuscale kfree_rcu_test=1
> // After some time
> rmmod rcuscale
> rmmod torture
>
> [2] BUG: unable to handle page fault for address: ffffffffc0601a87
> #PF: supervisor instruction fetch in kernel mode
> #PF: error_code(0x0010) - not-present page
> PGD 11de4f067 P4D 11de4f067 PUD 11de51067 PMD 112f4d067 PTE 0
> Oops: 0010 [#1] PREEMPT SMP NOPTI
> CPU: 1 PID: 1798 Comm: kfree_scale_thr Not tainted 6.3.0-rc1-rcu+ #1
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
> RIP: 0010:0xffffffffc0601a87
> Code: Unable to access opcode bytes at 0xffffffffc0601a5d.
> RSP: 0018:ffffb25bc2e57e18 EFLAGS: 00010297
> RAX: 0000000000000000 RBX: ffffffffc061f0b6 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: ffffffff962fd0de RDI: ffffffff962fd0de
> RBP: ffffb25bc2e57ea8 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000
> R13: 0000000000000000 R14: 000000000000000a R15: 00000000001c1dbe
> FS: 0000000000000000(0000) GS:ffff921fa2200000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: ffffffffc0601a5d CR3: 000000011de4c006 CR4: 0000000000370ee0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
> <TASK>
> ? kvfree_call_rcu+0xf0/0x3a0
> ? kthread+0xf3/0x120
> ? kthread_complete_and_exit+0x20/0x20
> ? ret_from_fork+0x1f/0x30
> </TASK>
> Modules linked in: rfkill sunrpc ... [last unloaded: torture]
> CR2: ffffffffc0601a87
> ---[ end trace 0000000000000000 ]---
>
> Fixes: e6e78b004fa7 ("rcuperf: Add kfree_rcu() performance Tests")
> Reviewed-by: Davidlohr Bueso <[email protected]>
> Reviewed-by: Joel Fernandes (Google) <[email protected]>
> Signed-off-by: Qiuxu Zhuo <[email protected]>
Much better, thank you!
I queued and pushed both of them.
Thanx, Paul
> ---
> kernel/rcu/rcuscale.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/kernel/rcu/rcuscale.c b/kernel/rcu/rcuscale.c
> index 7e8965b0827a..d1221731c7cf 100644
> --- a/kernel/rcu/rcuscale.c
> +++ b/kernel/rcu/rcuscale.c
> @@ -797,6 +797,11 @@ rcu_scale_cleanup(void)
> if (gp_exp && gp_async)
> SCALEOUT_ERRSTRING("No expedited async GPs, so went with async!");
>
> + if (kfree_rcu_test) {
> + kfree_scale_cleanup();
> + return;
> + }
> +
> if (torture_cleanup_begin())
> return;
> if (!cur_ops) {
> --
> 2.17.1
>