2020-01-17 17:44:10

by Steven Rostedt

[permalink] [raw]
Subject: [PATCH RT 24/32] sched: migrate_enable: Use stop_one_cpu_nowait()

4.19.94-rt39-rc1 stable review patch.
If anyone has any objections, please let me know.

------------------

From: Scott Wood <[email protected]>

[ Upstream commit 6b39a1fa8c53cae08dc03afdae193b7d3a78a173 ]

migrate_enable() can be called with current->state != TASK_RUNNING.
Avoid clobbering the existing state by using stop_one_cpu_nowait().
Since we're stopping the current cpu, we know that we won't get
past __schedule() until migration_cpu_stop() has run (at least up to
the point of migrating us to another cpu).

Signed-off-by: Scott Wood <[email protected]>
Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
---
include/linux/stop_machine.h | 2 ++
kernel/sched/core.c | 23 +++++++++++++----------
kernel/stop_machine.c | 7 +++++--
3 files changed, 20 insertions(+), 12 deletions(-)

diff --git a/include/linux/stop_machine.h b/include/linux/stop_machine.h
index 6d3635c86dbe..82fc686ddd9e 100644
--- a/include/linux/stop_machine.h
+++ b/include/linux/stop_machine.h
@@ -26,6 +26,8 @@ struct cpu_stop_work {
cpu_stop_fn_t fn;
void *arg;
struct cpu_stop_done *done;
+ /* Did not run due to disabled stopper; for nowait debug checks */
+ bool disabled;
};

int stop_one_cpu(unsigned int cpu, cpu_stop_fn_t fn, void *arg);
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index e97ac751aad2..e465381b464d 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -990,6 +990,7 @@ static struct rq *move_queued_task(struct rq *rq, struct rq_flags *rf,
struct migration_arg {
struct task_struct *task;
int dest_cpu;
+ bool done;
};

/*
@@ -1025,6 +1026,11 @@ static int migration_cpu_stop(void *data)
struct task_struct *p = arg->task;
struct rq *rq = this_rq();
struct rq_flags rf;
+ int dest_cpu = arg->dest_cpu;
+
+ /* We don't look at arg after this point. */
+ smp_mb();
+ arg->done = true;

/*
* The original target CPU might have gone down and we might
@@ -1047,9 +1053,9 @@ static int migration_cpu_stop(void *data)
*/
if (task_rq(p) == rq) {
if (task_on_rq_queued(p))
- rq = __migrate_task(rq, &rf, p, arg->dest_cpu);
+ rq = __migrate_task(rq, &rf, p, dest_cpu);
else
- p->wake_cpu = arg->dest_cpu;
+ p->wake_cpu = dest_cpu;
}
rq_unlock(rq, &rf);
raw_spin_unlock(&p->pi_lock);
@@ -7322,6 +7328,7 @@ void migrate_enable(void)
WARN_ON(smp_processor_id() != cpu);
if (!is_cpu_allowed(p, cpu)) {
struct migration_arg arg = { p };
+ struct cpu_stop_work work;
struct rq_flags rf;

rq = task_rq_lock(p, &rf);
@@ -7329,15 +7336,11 @@ void migrate_enable(void)
arg.dest_cpu = select_fallback_rq(cpu, p);
task_rq_unlock(rq, p, &rf);

- preempt_lazy_enable();
- preempt_enable();
-
- sleeping_lock_inc();
- stop_one_cpu(task_cpu(p), migration_cpu_stop, &arg);
- sleeping_lock_dec();
+ stop_one_cpu_nowait(task_cpu(p), migration_cpu_stop,
+ &arg, &work);
tlb_migrate_finish(p->mm);
-
- return;
+ __schedule(true);
+ WARN_ON_ONCE(!arg.done && !work.disabled);
}

out:
diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index 067cb83f37ea..2d15c0d50625 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -86,8 +86,11 @@ static bool cpu_stop_queue_work(unsigned int cpu, struct cpu_stop_work *work)
enabled = stopper->enabled;
if (enabled)
__cpu_stop_queue_work(stopper, work, &wakeq);
- else if (work->done)
- cpu_stop_signal_done(work->done);
+ else {
+ work->disabled = true;
+ if (work->done)
+ cpu_stop_signal_done(work->done);
+ }
raw_spin_unlock_irqrestore(&stopper->lock, flags);

wake_up_q(&wakeq);
--
2.24.1



Subject: Re: [PATCH RT 24/32] sched: migrate_enable: Use stop_one_cpu_nowait()

On 2020-01-17 12:41:35 [-0500], Steven Rostedt wrote:
> 4.19.94-rt39-rc1 stable review patch.
> If anyone has any objections, please let me know.

I don't know how much of this patch and the previous is classified as
"new feature" vs "bug fix". This patch requires patch 31 (of this series)
as bug fix.
I'm not against it, just pointing out.

Sebastian

2020-01-22 11:34:18

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH RT 24/32] sched: migrate_enable: Use stop_one_cpu_nowait()

On Wed, 22 Jan 2020 09:31:30 +0100
Sebastian Andrzej Siewior <[email protected]> wrote:

> On 2020-01-17 12:41:35 [-0500], Steven Rostedt wrote:
> > 4.19.94-rt39-rc1 stable review patch.
> > If anyone has any objections, please let me know.
>
> I don't know how much of this patch and the previous is classified as
> "new feature" vs "bug fix". This patch requires patch 31 (of this series)
> as bug fix.
> I'm not against it, just pointing out.
>

Hmm, the description looked more of a bug fix than a new feature.

-- Steve

2020-01-22 18:41:35

by Crystal Wood

[permalink] [raw]
Subject: Re: [PATCH RT 24/32] sched: migrate_enable: Use stop_one_cpu_nowait()

On Wed, 2020-01-22 at 06:33 -0500, Steven Rostedt wrote:
> On Wed, 22 Jan 2020 09:31:30 +0100
> Sebastian Andrzej Siewior <[email protected]> wrote:
>
> > On 2020-01-17 12:41:35 [-0500], Steven Rostedt wrote:
> > > 4.19.94-rt39-rc1 stable review patch.
> > > If anyone has any objections, please let me know.
> >
> > I don't know how much of this patch and the previous is classified as
> > "new feature" vs "bug fix". This patch requires patch 31 (of this
> > series)
> > as bug fix.
> > I'm not against it, just pointing out.
> >
>
> Hmm, the description looked more of a bug fix than a new feature.

Yes, the state clobber was an existing bug. Patch 23 was a performance
improvement rather than a bugfix, though.

-Scott


2020-01-22 19:09:29

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH RT 24/32] sched: migrate_enable: Use stop_one_cpu_nowait()

On Wed, 22 Jan 2020 12:39:14 -0600
Scott Wood <[email protected]> wrote:

> On Wed, 2020-01-22 at 06:33 -0500, Steven Rostedt wrote:
> > On Wed, 22 Jan 2020 09:31:30 +0100
> > Sebastian Andrzej Siewior <[email protected]> wrote:
> >
> > > On 2020-01-17 12:41:35 [-0500], Steven Rostedt wrote:
> > > > 4.19.94-rt39-rc1 stable review patch.
> > > > If anyone has any objections, please let me know.
> > >
> > > I don't know how much of this patch and the previous is classified as
> > > "new feature" vs "bug fix". This patch requires patch 31 (of this
> > > series)
> > > as bug fix.
> > > I'm not against it, just pointing out.
> > >
> >
> > Hmm, the description looked more of a bug fix than a new feature.
>
> Yes, the state clobber was an existing bug. Patch 23 was a performance
> improvement rather than a bugfix, though.
>

Some people consider performance improvements bug fixes ;-)

-- Steve