Dear RT Folks,
This is the RT stable review cycle of patch 4.1.42-rt50-rc1. Please review
the included patches, and test!
The -rc release will be uploaded to kernel.org and will be deleted when the
final release is out. This is just a review release (or release candidate).
The pre-releases will not be pushed to the git repository; only the
final release will be.
If all goes well, this patch will be converted to the next main release
on 8/14/2017.
Julia
----------------------------------------------------------------
To build 4.1.42-rt50-rc1 directly, the following patches should be applied:
http://www.kernel.org/pub/linux/kernel/v4.x/linux-4.1.tar.xz
http://www.kernel.org/pub/linux/kernel/v4.x/patch-4.1.42.xz
http://www.kernel.org/pub/linux/kernel/projects/rt/4.1/patch-4.1.42-rt50-rc1.patch.xz
You can also build from the 4.1.42-rt49 release by applying the incremental patch:
http://www.kernel.org/pub/linux/kernel/projects/rt/4.1/incr/patch-4.1.42-rt49-rt50-rc1.patch.xz
----------------------------------------------------------------
Alex Shi (1):
cpu_pm: replace raw_notifier to atomic_notifier
Julia Cartwright (1):
Linux 4.1.42-rt50-rc1
Peter Zijlstra (2):
lockdep: Fix per-cpu static objects
sched: Remove TASK_ALL
Thomas Gleixner (2):
rtmutex: Make lock_killable work
sched: Prevent task state corruption by spurious lock wakeup
include/linux/sched.h | 1 -
include/linux/smp.h | 12 ++++++++++++
init/main.c | 8 ++++++++
kernel/cpu_pm.c | 43 ++++++-------------------------------------
kernel/locking/rtmutex.c | 19 +++++++------------
kernel/module.c | 6 +++++-
kernel/sched/core.c | 2 +-
localversion-rt | 2 +-
mm/percpu.c | 5 ++++-
9 files changed, 44 insertions(+), 54 deletions(-)
--
2.13.1
4.1.42-rt50-rc1 stable review patch.
If you have any objection to the inclusion of this patch, let me know.
--- 8< --- 8< --- 8< ---
From: Thomas Gleixner <[email protected]>
Mathias and others reported GDB failures on RT.
The following scenario leads to task state corruption:
CPU0                                    CPU1

T1->state = TASK_XXX;
spin_lock(&lock)
rt_spin_lock_slowlock(&lock->rtmutex)
raw_spin_lock(&rtm->wait_lock);
T1->saved_state = current->state;
T1->state = TASK_UNINTERRUPTIBLE;
                                        spin_unlock(&lock)
task_blocks_on_rt_mutex(rtm)            rt_spin_lock_slowunlock(&lock->rtmutex)
  queue_waiter(rtm)                     raw_spin_lock(&rtm->wait_lock);
  pi_chain_walk(rtm)
    raw_spin_unlock(&rtm->wait_lock);
                                        wake_top_waiter(T1)
raw_spin_lock(&rtm->wait_lock);
for (;;) {
    if (__try_to_take_rt_mutex())       <- Succeeds
        break;
    ...
}
T1->state = T1->saved_state;
                                        try_to_wake_up(T1)
                                          ttwu_do_wakeup(T1)
                                            T1->state = TASK_RUNNING;
In most cases this is harmless, because waiting for some event, which is
the usual reason for TASK_[UN]INTERRUPTIBLE, has to be safe against other
forms of spurious wakeups anyway.
But in the case of TASK_TRACED this is actually fatal, because the task
loses the TASK_TRACED state. As a consequence it fails to consume the
SIGSTOP sent from the debugger; SIGSTOP is instead delivered to the task,
which breaks the ptrace mechanics and leaves the debugger in an
unexpected state.
The TASK_TRACED state should prevent this, due to the state-matching
logic in try_to_wake_up(). But it doesn't, because wake_up_lock_sleeper()
uses TASK_ALL as the state mask. That's bogus: lock sleepers always sleep
in TASK_UNINTERRUPTIBLE, so the wakeup should use that mask as well.
The cure is far simpler than figuring it out:
Change the mask used in wake_up_lock_sleeper() from TASK_ALL to
TASK_UNINTERRUPTIBLE.
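For illustration, here is a greatly simplified sketch (not the kernel's
actual code) of the state filter at the top of try_to_wake_up(): the
wakeup only proceeds when the task's current state is covered by the
caller's mask. With TASK_ALL the mask also covers __TASK_TRACED, so a
lock-sleeper wakeup can hit a traced task; with TASK_UNINTERRUPTIBLE it
cannot:

    /*
     * Simplified sketch of the try_to_wake_up() state filter; the real
     * function also handles locking, queueing and preemption.
     */
    static int sketch_try_to_wake_up(struct task_struct *p,
                                     unsigned int state_mask)
    {
            if (!(p->state & state_mask))
                    return 0;       /* state not covered: no wakeup */
            p->state = TASK_RUNNING;
            return 1;
    }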
Cc: [email protected]
Reported-by: Mathias Koehrer <[email protected]>
Reported-by: David Hauck <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
(cherry picked from commit 2f9f24e15088d2ef3244d088a9604d7e98c9c625)
Signed-off-by: Julia Cartwright <[email protected]>
---
kernel/sched/core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 0d3a40b24304..ee11a59e53ff 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1876,7 +1876,7 @@ EXPORT_SYMBOL(wake_up_process);
*/
int wake_up_lock_sleeper(struct task_struct *p)
{
- return try_to_wake_up(p, TASK_ALL, WF_LOCK_SLEEPER);
+ return try_to_wake_up(p, TASK_UNINTERRUPTIBLE, WF_LOCK_SLEEPER);
}
int wake_up_state(struct task_struct *p, unsigned int state)
--
2.13.1
4.1.42-rt50-rc1 stable review patch.
If you have any objection to the inclusion of this patch, let me know.
--- 8< --- 8< --- 8< ---
From: Alex Shi <[email protected]>
This patch replaces the rwlock and raw notifier with an atomic notifier,
which is protected by a spinlock and RCU.
The first reason for this replacement is a 'scheduling while atomic' bug
in the RT kernel on arm/arm64 platforms. On arm/arm64, the rwlock
cpu_pm_notifier_lock in cpu_pm causes a potential schedule after
interrupts are disabled in the idle call chain:
cpu_startup_entry
  cpu_idle_loop
    local_irq_disable()
    cpuidle_idle_call
      call_cpuidle
        cpuidle_enter
          cpuidle_enter_state
            ->enter :arm_enter_idle_state
              cpu_pm_enter/exit
                CPU_PM_CPU_IDLE_ENTER
                  read_lock(&cpu_pm_notifier_lock); <-- sleep in idle
                    __rt_spin_lock();
                      schedule();
The kernel panic is here:
[ 4.609601] BUG: scheduling while atomic: swapper/1/0/0x00000002
[ 4.609608] [<ffff0000086fae70>] arm_enter_idle_state+0x18/0x70
[ 4.609614] Modules linked in:
[ 4.609615] [<ffff0000086f9298>] cpuidle_enter_state+0xf0/0x218
[ 4.609620] [<ffff0000086f93f8>] cpuidle_enter+0x18/0x20
[ 4.609626] Preemption disabled at:
[ 4.609627] [<ffff0000080fa234>] call_cpuidle+0x24/0x40
[ 4.609635] [<ffff000008882fa4>] schedule_preempt_disabled+0x1c/0x28
[ 4.609639] [<ffff0000080fa49c>] cpu_startup_entry+0x154/0x1f8
[ 4.609645] [<ffff00000808e004>] secondary_start_kernel+0x15c/0x1a0
Daniel Lezcano said this notification is needed on arm/arm64 platforms.
Sebastian suggested using an atomic_notifier instead of the rwlock; this
not only removes the sleeping in idle, but also improves latency.
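For reference, a condensed sketch of why the atomic notifier chain is
safe in this path, modeled on __atomic_notifier_call_chain() (simplified,
not the exact upstream code): the chain walk runs entirely under
rcu_read_lock(), which never blocks, even on RT, whereas a rwlock becomes
a sleeping rtmutex there:

    static int sketch_atomic_notifier_call_chain(struct atomic_notifier_head *nh,
                                                 unsigned long val, void *v)
    {
            int ret;

            rcu_read_lock();        /* preempt-safe, never sleeps */
            ret = notifier_call_chain(&nh->head, val, v, -1, NULL);
            rcu_read_unlock();
            return ret;
    }

Registration and unregistration take the chain's internal spinlock; the
hot read path in idle needs only RCU.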
This patch passed Fengguang's 0day testing.
Signed-off-by: Alex Shi <[email protected]>
Cc: Sebastian Andrzej Siewior <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Anders Roxell <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Rafael J. Wysocki <[email protected]>
Cc: Daniel Lezcano <[email protected]>
Cc: linux-rt-users <[email protected]>
Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
(cherry picked from commit df0fba5ba4c69cdc68bdaa5ca7a4100d959fdd07)
Signed-off-by: Julia Cartwright <[email protected]>
---
kernel/cpu_pm.c | 43 ++++++-------------------------------------
1 file changed, 6 insertions(+), 37 deletions(-)
diff --git a/kernel/cpu_pm.c b/kernel/cpu_pm.c
index 9656a3c36503..9da42f83ee03 100644
--- a/kernel/cpu_pm.c
+++ b/kernel/cpu_pm.c
@@ -22,14 +22,13 @@
#include <linux/spinlock.h>
#include <linux/syscore_ops.h>
-static DEFINE_RWLOCK(cpu_pm_notifier_lock);
-static RAW_NOTIFIER_HEAD(cpu_pm_notifier_chain);
+static ATOMIC_NOTIFIER_HEAD(cpu_pm_notifier_chain);
static int cpu_pm_notify(enum cpu_pm_event event, int nr_to_call, int *nr_calls)
{
int ret;
- ret = __raw_notifier_call_chain(&cpu_pm_notifier_chain, event, NULL,
+ ret = __atomic_notifier_call_chain(&cpu_pm_notifier_chain, event, NULL,
nr_to_call, nr_calls);
return notifier_to_errno(ret);
@@ -47,14 +46,7 @@ static int cpu_pm_notify(enum cpu_pm_event event, int nr_to_call, int *nr_calls)
*/
int cpu_pm_register_notifier(struct notifier_block *nb)
{
- unsigned long flags;
- int ret;
-
- write_lock_irqsave(&cpu_pm_notifier_lock, flags);
- ret = raw_notifier_chain_register(&cpu_pm_notifier_chain, nb);
- write_unlock_irqrestore(&cpu_pm_notifier_lock, flags);
-
- return ret;
+ return atomic_notifier_chain_register(&cpu_pm_notifier_chain, nb);
}
EXPORT_SYMBOL_GPL(cpu_pm_register_notifier);
@@ -69,14 +61,7 @@ EXPORT_SYMBOL_GPL(cpu_pm_register_notifier);
*/
int cpu_pm_unregister_notifier(struct notifier_block *nb)
{
- unsigned long flags;
- int ret;
-
- write_lock_irqsave(&cpu_pm_notifier_lock, flags);
- ret = raw_notifier_chain_unregister(&cpu_pm_notifier_chain, nb);
- write_unlock_irqrestore(&cpu_pm_notifier_lock, flags);
-
- return ret;
+ return atomic_notifier_chain_unregister(&cpu_pm_notifier_chain, nb);
}
EXPORT_SYMBOL_GPL(cpu_pm_unregister_notifier);
@@ -100,7 +85,6 @@ int cpu_pm_enter(void)
int nr_calls;
int ret = 0;
- read_lock(&cpu_pm_notifier_lock);
ret = cpu_pm_notify(CPU_PM_ENTER, -1, &nr_calls);
if (ret)
/*
@@ -108,7 +92,6 @@ int cpu_pm_enter(void)
* PM entry who are notified earlier to prepare for it.
*/
cpu_pm_notify(CPU_PM_ENTER_FAILED, nr_calls - 1, NULL);
- read_unlock(&cpu_pm_notifier_lock);
return ret;
}
@@ -128,13 +111,7 @@ EXPORT_SYMBOL_GPL(cpu_pm_enter);
*/
int cpu_pm_exit(void)
{
- int ret;
-
- read_lock(&cpu_pm_notifier_lock);
- ret = cpu_pm_notify(CPU_PM_EXIT, -1, NULL);
- read_unlock(&cpu_pm_notifier_lock);
-
- return ret;
+ return cpu_pm_notify(CPU_PM_EXIT, -1, NULL);
}
EXPORT_SYMBOL_GPL(cpu_pm_exit);
@@ -159,7 +136,6 @@ int cpu_cluster_pm_enter(void)
int nr_calls;
int ret = 0;
- read_lock(&cpu_pm_notifier_lock);
ret = cpu_pm_notify(CPU_CLUSTER_PM_ENTER, -1, &nr_calls);
if (ret)
/*
@@ -167,7 +143,6 @@ int cpu_cluster_pm_enter(void)
* PM entry who are notified earlier to prepare for it.
*/
cpu_pm_notify(CPU_CLUSTER_PM_ENTER_FAILED, nr_calls - 1, NULL);
- read_unlock(&cpu_pm_notifier_lock);
return ret;
}
@@ -190,13 +165,7 @@ EXPORT_SYMBOL_GPL(cpu_cluster_pm_enter);
*/
int cpu_cluster_pm_exit(void)
{
- int ret;
-
- read_lock(&cpu_pm_notifier_lock);
- ret = cpu_pm_notify(CPU_CLUSTER_PM_EXIT, -1, NULL);
- read_unlock(&cpu_pm_notifier_lock);
-
- return ret;
+ return cpu_pm_notify(CPU_CLUSTER_PM_EXIT, -1, NULL);
}
EXPORT_SYMBOL_GPL(cpu_cluster_pm_exit);
--
2.13.1
4.1.42-rt50-rc1 stable review patch.
If you have any objection to the inclusion of this patch, let me know.
--- 8< --- 8< --- 8< ---
From: Peter Zijlstra <[email protected]>
It's unused:
$ git grep "\<TASK_ALL\>" | wc -l
1
And dangerous, kill the bugger.
Cc: [email protected]
Acked-by: Thomas Gleixner <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
(cherry picked from commit a1762267d95649bddf6e94e7e3305e0207d0fff0)
Signed-off-by: Julia Cartwright <[email protected]>
---
include/linux/sched.h | 1 -
1 file changed, 1 deletion(-)
diff --git a/include/linux/sched.h b/include/linux/sched.h
index d51525ce2c41..7587d6181cd2 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -227,7 +227,6 @@ extern char ___assert_task_state[1 - 2*!!(
/* Convenience macros for the sake of wake_up */
#define TASK_NORMAL (TASK_INTERRUPTIBLE | TASK_UNINTERRUPTIBLE)
-#define TASK_ALL (TASK_NORMAL | __TASK_STOPPED | __TASK_TRACED)
/* get_task_state() */
#define TASK_REPORT (TASK_RUNNING | TASK_INTERRUPTIBLE | \
--
2.13.1
4.1.42-rt50-rc1 stable review patch.
If you have any objection to the inclusion of this patch, let me know.
--- 8< --- 8< --- 8< ---
From: Peter Zijlstra <[email protected]>
Since commit 383776fa7527 ("locking/lockdep: Handle statically initialized
PER_CPU locks properly") we try to collapse per-cpu locks into a single
class by giving them all the same key. For this key we choose the canonical
address of the per-cpu object, which would be the offset into the per-cpu
area.
This has two problems:
- there is a case where we run !0 lock->key through static_obj() and
expect this to pass; it doesn't for canonical pointers.
- 0 is a valid canonical address.
Cure both issues by redefining the canonical address as the address of the
per-cpu variable on the boot CPU.
Since I didn't want to rely on CPU0 being the boot-cpu, or even existing at
all, track the boot CPU in a variable.
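Condensed from the patch below, the boot-CPU tracking amounts to
recording the CPU id once during early init and exposing it through a
trivial accessor:

    int __boot_cpu_id;                  /* set once in boot_cpu_init() */

    static inline int get_boot_cpu_id(void)
    {
            return __boot_cpu_id;
    }

The canonical address is then the offset into the per-cpu area rebased
onto the boot CPU's copy, i.e. (va - start) + per_cpu_ptr(base,
get_boot_cpu_id()), which is a unique, non-zero kernel address that
static_obj() accepts.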
Fixes: 383776fa7527 ("locking/lockdep: Handle statically initialized PER_CPU locks properly")
Reported-by: kernel test robot <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Tested-by: Borislav Petkov <[email protected]>
Cc: Sebastian Andrzej Siewior <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: kernel test robot <[email protected]>
Cc: LKP <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>
(cherry picked from commit c9fe9196079f738c89c3ffcdce3fbe142ac3f3c4)
Signed-off-by: Julia Cartwright <[email protected]>
---
include/linux/smp.h | 12 ++++++++++++
init/main.c | 8 ++++++++
kernel/module.c | 6 +++++-
mm/percpu.c | 5 ++++-
4 files changed, 29 insertions(+), 2 deletions(-)
diff --git a/include/linux/smp.h b/include/linux/smp.h
index e6ab36aeaaab..cbf6836524dc 100644
--- a/include/linux/smp.h
+++ b/include/linux/smp.h
@@ -120,6 +120,13 @@ extern unsigned int setup_max_cpus;
extern void __init setup_nr_cpu_ids(void);
extern void __init smp_init(void);
+extern int __boot_cpu_id;
+
+static inline int get_boot_cpu_id(void)
+{
+ return __boot_cpu_id;
+}
+
#else /* !SMP */
static inline void smp_send_stop(void) { }
@@ -158,6 +165,11 @@ static inline void smp_init(void) { up_late_init(); }
static inline void smp_init(void) { }
#endif
+static inline int get_boot_cpu_id(void)
+{
+ return 0;
+}
+
#endif /* !SMP */
/*
diff --git a/init/main.c b/init/main.c
index 0486a8e11fc0..e1bae15a2154 100644
--- a/init/main.c
+++ b/init/main.c
@@ -451,6 +451,10 @@ void __init parse_early_param(void)
* Activate the first processor.
*/
+#ifdef CONFIG_SMP
+int __boot_cpu_id;
+#endif
+
static void __init boot_cpu_init(void)
{
int cpu = smp_processor_id();
@@ -459,6 +463,10 @@ static void __init boot_cpu_init(void)
set_cpu_active(cpu, true);
set_cpu_present(cpu, true);
set_cpu_possible(cpu, true);
+
+#ifdef CONFIG_SMP
+ __boot_cpu_id = cpu;
+#endif
}
void __init __weak smp_setup_processor_id(void)
diff --git a/kernel/module.c b/kernel/module.c
index a7ac858fd1a1..982c57b2c2a1 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -542,8 +542,12 @@ bool __is_module_percpu_address(unsigned long addr, unsigned long *can_addr)
void *va = (void *)addr;
if (va >= start && va < start + mod->percpu_size) {
- if (can_addr)
+ if (can_addr) {
*can_addr = (unsigned long) (va - start);
+ *can_addr += (unsigned long)
+ per_cpu_ptr(mod->percpu,
+ get_boot_cpu_id());
+ }
preempt_enable();
return true;
}
diff --git a/mm/percpu.c b/mm/percpu.c
index 4146b00bfde7..b41c3960d5fb 100644
--- a/mm/percpu.c
+++ b/mm/percpu.c
@@ -1297,8 +1297,11 @@ bool __is_kernel_percpu_address(unsigned long addr, unsigned long *can_addr)
void *va = (void *)addr;
if (va >= start && va < start + static_size) {
- if (can_addr)
+ if (can_addr) {
*can_addr = (unsigned long) (va - start);
+ *can_addr += (unsigned long)
+ per_cpu_ptr(base, get_boot_cpu_id());
+ }
return true;
}
}
--
2.13.1
4.1.42-rt50-rc1 stable review patch.
If you have any objection to the inclusion of this patch, let me know.
--- 8< --- 8< --- 8< ---
---
localversion-rt | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/localversion-rt b/localversion-rt
index 4b7dca68a5b4..e8a9a36bb066 100644
--- a/localversion-rt
+++ b/localversion-rt
@@ -1 +1 @@
--rt49
+-rt50-rc1
--
2.13.1
4.1.42-rt50-rc1 stable review patch.
If you have any objection to the inclusion of this patch, let me know.
--- 8< --- 8< --- 8< ---
From: Thomas Gleixner <[email protected]>
Locking an rt mutex killable does not work because signal handling is
restricted to TASK_INTERRUPTIBLE.
Use signal_pending_state() unconditionally.
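For reference, signal_pending_state() upstream is along these lines (a
sketch of the scheduler-header definition; consult include/linux/sched.h
for the exact code): a pending signal terminates the sleep only if the
state is interruptible, or killable with a fatal signal pending:

    static inline int signal_pending_state(long state, struct task_struct *p)
    {
            if (!(state & (TASK_INTERRUPTIBLE | TASK_WAKEKILL)))
                    return 0;
            if (!signal_pending(p))
                    return 0;

            return (state & TASK_INTERRUPTIBLE) || __fatal_signal_pending(p);
    }

Calling it unconditionally is therefore safe for plain
TASK_UNINTERRUPTIBLE sleeps: the first check makes it a no-op there.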
Cc: [email protected]
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
(cherry picked from commit 9edcb2cd71ff3684755f52129538260efa382789)
Signed-off-by: Julia Cartwright <[email protected]>
---
kernel/locking/rtmutex.c | 19 +++++++------------
1 file changed, 7 insertions(+), 12 deletions(-)
diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index e0b0d9b419b5..3e45ceb862bd 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -1511,18 +1511,13 @@ __rt_mutex_slowlock(struct rt_mutex *lock, int state,
if (try_to_take_rt_mutex(lock, current, waiter))
break;
- /*
- * TASK_INTERRUPTIBLE checks for signals and
- * timeout. Ignored otherwise.
- */
- if (unlikely(state == TASK_INTERRUPTIBLE)) {
- /* Signal pending? */
- if (signal_pending(current))
- ret = -EINTR;
- if (timeout && !timeout->task)
- ret = -ETIMEDOUT;
- if (ret)
- break;
+ if (timeout && !timeout->task) {
+ ret = -ETIMEDOUT;
+ break;
+ }
+ if (signal_pending_state(state, current)) {
+ ret = -EINTR;
+ break;
}
if (ww_ctx && ww_ctx->acquired > 0) {
--
2.13.1