2016-03-01 19:42:14

by Mark Rutland

[permalink] [raw]
Subject: Re: [PATCH v1] kasan, arm64: Unpoison dirty stack frames when resuming from suspend.

On Fri, Feb 26, 2016 at 06:28:27PM +0100, Alexander Potapenko wrote:
> On Fri, Feb 26, 2016 at 2:53 PM, Mark Rutland <[email protected]> wrote:
> > Hi,
> >
> > On Fri, Feb 26, 2016 at 01:38:37PM +0100, Alexander Potapenko wrote:
> >> Before an ARM64 CPU is suspended, the kernel saves the context which will
> >> be used to initialize the register state upon resume. After that and
> >> before the actual execution of the SMC instruction the kernel creates
> >> several stack frames which are never unpoisoned because arm_smccc_smc()
> >> does not return. This may cause false positive stack buffer overflow
> >> reports from KASAN.
> >>
> >> The solution is to record the stack pointer value just before the CPU is
> >> suspended, and unpoison the part of stack between the saved value and
> >> the stack pointer upon resume.
> >
> > Thanks for looking into this! That's much appreciated.
> >
> > I think the general approach (unposioning the stack upon cold return to
> > the kernel) is fine, but I have concerns with the implementation, which
> > I've noted below.

For the idle case I intend to respin my patch [1] which calls
kasan_unpoison_shadow from assembly in the resume path, as I think
that's the only reliable approach.

> > The problem also applies for hotplug, as leftover poison from the
> > hot-unplug path isn't cleaned before a CPU is hotplugged back on. The
> > first few functions are likely deterministic in their stack usage, so
> > it's not seen with a defconfig, but I think it's possible to trigger,
> > and it's also a cross-architecture problem shared with x86.
> Agreed, but since I haven't yet seen problems with hotplug, it's hard
> to test the fix for them.

For the common hotplug case, how about the below?

I've given it a spin locally on arm64 with the reproducer I posted
earlier.

Thanks,
Mark.

[1] http://lists.infradead.org/pipermail/linux-arm-kernel/2016-February/409466.html

---->8----
>From 34839286826c88338cd91a142b1bcc3c077a87aa Mon Sep 17 00:00:00 2001
From: Mark Rutland <[email protected]>
Date: Tue, 1 Mar 2016 19:27:23 +0000
Subject: [PATCH] sched/kasan: clear stale stack poison

CPUs get hotplugged out some levels deep in C code, and hence when KASAN
is in use, the instrumented function preambles will have left the stack
shadow area poisoned.

This poison is not cleared, so when a CPU re-enters the kernel, it is
possible for accesses in instrumented functions to hit this stale
poison, resulting in (spurious) KASAN splats.

This patch forcefully unpoisons an idle task's stack shadow when it is
re-initialised prior to a hotplug, avoiding spurious hits against stale
poison.

Signed-off-by: Mark Rutland <[email protected]>
---
include/linux/kasan.h | 4 ++++
kernel/sched/core.c | 3 +++
mm/kasan/kasan.c | 10 ++++++++++
3 files changed, 17 insertions(+)

diff --git a/include/linux/kasan.h b/include/linux/kasan.h
index 4b9f85c..e00486f 100644
--- a/include/linux/kasan.h
+++ b/include/linux/kasan.h
@@ -43,6 +43,8 @@ static inline void kasan_disable_current(void)

void kasan_unpoison_shadow(const void *address, size_t size);

+void kasan_unpoison_task_stack(struct task_struct *idle);
+
void kasan_alloc_pages(struct page *page, unsigned int order);
void kasan_free_pages(struct page *page, unsigned int order);

@@ -66,6 +68,8 @@ void kasan_free_shadow(const struct vm_struct *vm);

static inline void kasan_unpoison_shadow(const void *address, size_t size) {}

+static inline void kasan_unpoison_task_stack(struct task_struct *idle) {}
+
static inline void kasan_enable_current(void) {}
static inline void kasan_disable_current(void) {}

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 9503d59..41f6b22 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -26,6 +26,7 @@
* Thomas Gleixner, Mike Kravetz
*/

+#include <linux/kasan.h>
#include <linux/mm.h>
#include <linux/module.h>
#include <linux/nmi.h>
@@ -5096,6 +5097,8 @@ void init_idle(struct task_struct *idle, int cpu)
idle->state = TASK_RUNNING;
idle->se.exec_start = sched_clock();

+ kasan_unpoison_task_stack(idle);
+
#ifdef CONFIG_SMP
/*
* Its possible that init_idle() gets called multiple times on a task,
diff --git a/mm/kasan/kasan.c b/mm/kasan/kasan.c
index bc0a8d8..467f394 100644
--- a/mm/kasan/kasan.c
+++ b/mm/kasan/kasan.c
@@ -60,6 +60,16 @@ void kasan_unpoison_shadow(const void *address, size_t size)
}
}

+/*
+ * Remove any poison left on the stack from a prior hot-unplug.
+ */
+void kasan_unpoison_task_stack(struct task_struct *idle)
+{
+ void *base = task_stack_page(idle) + sizeof(struct thread_info);
+ size_t size = THREAD_SIZE - sizeof(struct thread_info);
+
+ kasan_unpoison_shadow(base, size);
+}

/*
* All functions below always inlined so compiler could
--
1.9.1


2016-03-01 20:06:52

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH] sched/kasan: clear stale stack poison

Hi Mark,

[auto build test WARNING on tip/sched/core]
[also build test WARNING on v4.5-rc6 next-20160301]
[if your patch is applied to the wrong git tree, please drop us a note to help improving the system]

url: https://github.com/0day-ci/linux/commits/Mark-Rutland/sched-kasan-clear-stale-stack-poison/20160302-034556
config: sparc64-allyesconfig (attached as .config)
reproduce:
wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=sparc64

All warnings (new ones prefixed by >>):

In file included from kernel/sched/core.c:29:0:
include/linux/kasan.h:71:53: warning: 'struct task_struct' declared inside parameter list
static inline void kasan_unpoison_task_stack(struct task_struct *idle) {}
^
include/linux/kasan.h:71:53: warning: its scope is only this definition or declaration, which is probably not what you want
kernel/sched/core.c: In function 'init_idle':
>> kernel/sched/core.c:5026:2: warning: passing argument 1 of 'kasan_unpoison_task_stack' from incompatible pointer type
kasan_unpoison_task_stack(idle);
^
In file included from kernel/sched/core.c:29:0:
include/linux/kasan.h:71:20: note: expected 'struct task_struct *' but argument is of type 'struct task_struct *'
static inline void kasan_unpoison_task_stack(struct task_struct *idle) {}
^

vim +/kasan_unpoison_task_stack +5026 kernel/sched/core.c

5010 *
5011 * NOTE: this function does not set the idle thread's NEED_RESCHED
5012 * flag, to make booting more robust.
5013 */
5014 void init_idle(struct task_struct *idle, int cpu)
5015 {
5016 struct rq *rq = cpu_rq(cpu);
5017 unsigned long flags;
5018
5019 raw_spin_lock_irqsave(&idle->pi_lock, flags);
5020 raw_spin_lock(&rq->lock);
5021
5022 __sched_fork(0, idle);
5023 idle->state = TASK_RUNNING;
5024 idle->se.exec_start = sched_clock();
5025
> 5026 kasan_unpoison_task_stack(idle);
5027
5028 #ifdef CONFIG_SMP
5029 /*
5030 * Its possible that init_idle() gets called multiple times on a task,
5031 * in that case do_set_cpus_allowed() will not do the right thing.
5032 *
5033 * And since this is boot we can forgo the serialization.
5034 */

---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation


Attachments:
(No filename) (2.54 kB)
.config.gz (44.04 kB)
Download all attachments

2016-03-02 12:53:40

by Andrey Ryabinin

[permalink] [raw]
Subject: Re: [PATCH v1] kasan, arm64: Unpoison dirty stack frames when resuming from suspend.



On 03/01/2016 10:42 PM, Mark Rutland wrote:
> On Fri, Feb 26, 2016 at 06:28:27PM +0100, Alexander Potapenko wrote:
>> On Fri, Feb 26, 2016 at 2:53 PM, Mark Rutland <[email protected]> wrote:
>>> Hi,
>>>
>>> On Fri, Feb 26, 2016 at 01:38:37PM +0100, Alexander Potapenko wrote:
>>>> Before an ARM64 CPU is suspended, the kernel saves the context which will
>>>> be used to initialize the register state upon resume. After that and
>>>> before the actual execution of the SMC instruction the kernel creates
>>>> several stack frames which are never unpoisoned because arm_smccc_smc()
>>>> does not return. This may cause false positive stack buffer overflow
>>>> reports from KASAN.
>>>>
>>>> The solution is to record the stack pointer value just before the CPU is
>>>> suspended, and unpoison the part of stack between the saved value and
>>>> the stack pointer upon resume.
>>>
>>> Thanks for looking into this! That's much appreciated.
>>>
>>> I think the general approach (unposioning the stack upon cold return to
>>> the kernel) is fine, but I have concerns with the implementation, which
>>> I've noted below.
>
> For the idle case I intend to respin my patch [1] which calls
> kasan_unpoison_shadow from assembly in the resume path, as I think
> that's the only reliable approach.
>
>>> The problem also applies for hotplug, as leftover poison from the
>>> hot-unplug path isn't cleaned before a CPU is hotplugged back on. The
>>> first few functions are likely deterministic in their stack usage, so
>>> it's not seen with a defconfig, but I think it's possible to trigger,
>>> and it's also a cross-architecture problem shared with x86.
>> Agreed, but since I haven't yet seen problems with hotplug, it's hard
>> to test the fix for them.
>
> For the common hotplug case, how about the below?
>

Nah, looks a bit hacky IMO. I think it's better to use cpu hotplug notifier.
I'll send patch shortly.

> I've given it a spin locally on arm64 with the reproducer I posted
> earlier.
>
> Thanks,
> Mark.
>
> [1] http://lists.infradead.org/pipermail/linux-arm-kernel/2016-February/409466.html
>
> ---->8----
> From 34839286826c88338cd91a142b1bcc3c077a87aa Mon Sep 17 00:00:00 2001
> From: Mark Rutland <[email protected]>
> Date: Tue, 1 Mar 2016 19:27:23 +0000
> Subject: [PATCH] sched/kasan: clear stale stack poison
>
> CPUs get hotplugged out some levels deep in C code, and hence when KASAN
> is in use, the instrumented function preambles will have left the stack
> shadow area poisoned.
>
> This poison is not cleared, so when a CPU re-enters the kernel, it is
> possible for accesses in instrumented functions to hit this stale
> poison, resulting in (spurious) KASAN splats.
>
> This patch forcefully unpoisons an idle task's stack shadow when it is
> re-initialised prior to a hotplug, avoiding spurious hits against stale
> poison.
>
> Signed-off-by: Mark Rutland <[email protected]>
> ---
> include/linux/kasan.h | 4 ++++
> kernel/sched/core.c | 3 +++
> mm/kasan/kasan.c | 10 ++++++++++
> 3 files changed, 17 insertions(+)
>
> diff --git a/include/linux/kasan.h b/include/linux/kasan.h
> index 4b9f85c..e00486f 100644
> --- a/include/linux/kasan.h
> +++ b/include/linux/kasan.h
> @@ -43,6 +43,8 @@ static inline void kasan_disable_current(void)
>
> void kasan_unpoison_shadow(const void *address, size_t size);
>
> +void kasan_unpoison_task_stack(struct task_struct *idle);
> +
> void kasan_alloc_pages(struct page *page, unsigned int order);
> void kasan_free_pages(struct page *page, unsigned int order);
>
> @@ -66,6 +68,8 @@ void kasan_free_shadow(const struct vm_struct *vm);
>
> static inline void kasan_unpoison_shadow(const void *address, size_t size) {}
>
> +static inline void kasan_unpoison_task_stack(struct task_struct *idle) {}
> +
> static inline void kasan_enable_current(void) {}
> static inline void kasan_disable_current(void) {}
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 9503d59..41f6b22 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -26,6 +26,7 @@
> * Thomas Gleixner, Mike Kravetz
> */
>
> +#include <linux/kasan.h>
> #include <linux/mm.h>
> #include <linux/module.h>
> #include <linux/nmi.h>
> @@ -5096,6 +5097,8 @@ void init_idle(struct task_struct *idle, int cpu)
> idle->state = TASK_RUNNING;
> idle->se.exec_start = sched_clock();
>
> + kasan_unpoison_task_stack(idle);
> +
> #ifdef CONFIG_SMP
> /*
> * Its possible that init_idle() gets called multiple times on a task,
> diff --git a/mm/kasan/kasan.c b/mm/kasan/kasan.c
> index bc0a8d8..467f394 100644
> --- a/mm/kasan/kasan.c
> +++ b/mm/kasan/kasan.c
> @@ -60,6 +60,16 @@ void kasan_unpoison_shadow(const void *address, size_t size)
> }
> }
>
> +/*
> + * Remove any poison left on the stack from a prior hot-unplug.
> + */
> +void kasan_unpoison_task_stack(struct task_struct *idle)
> +{
> + void *base = task_stack_page(idle) + sizeof(struct thread_info);
> + size_t size = THREAD_SIZE - sizeof(struct thread_info);
> +
> + kasan_unpoison_shadow(base, size);
> +}
>
> /*
> * All functions below always inlined so compiler could
>

2016-03-02 13:52:41

by Andrey Ryabinin

[permalink] [raw]
Subject: [PATCH 2/2] kasan: unpoison stack of idle task on cpu online

KASAN poisons stack redzones on function's entrance and unpoisons prior
return. So when cpu goes offline the stack of idle task left poisoned.
When cpu goes back online it re-enters the kernel via another path and
starts using idle task's stack. Hence it's possible to hit stale poison
values which results in false-positive KASAN splats.

This patch registers cpu hotplug notifier which unpoisons idle task's
stack prior to onlining cpu.

Signed-off-by: Andrey Ryabinin <[email protected]>
---
include/linux/sched.h | 6 ++++++
kernel/smpboot.h | 2 --
mm/kasan/kasan.c | 33 +++++++++++++++++++++++++++------
3 files changed, 33 insertions(+), 8 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index a10494a..18e526d 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -337,6 +337,12 @@ extern asmlinkage void schedule_tail(struct task_struct *prev);
extern void init_idle(struct task_struct *idle, int cpu);
extern void init_idle_bootup_task(struct task_struct *idle);

+#ifdef CONFIG_GENERIC_SMP_IDLE_THREAD
+extern struct task_struct *idle_thread_get(unsigned int cpu);
+#else
+static inline struct task_struct *idle_thread_get(unsigned int cpu) { return NULL; }
+#endif
+
extern cpumask_var_t cpu_isolated_map;

extern int runqueue_is_locked(int cpu);
diff --git a/kernel/smpboot.h b/kernel/smpboot.h
index 72415a0..eebf9ec 100644
--- a/kernel/smpboot.h
+++ b/kernel/smpboot.h
@@ -4,11 +4,9 @@
struct task_struct;

#ifdef CONFIG_GENERIC_SMP_IDLE_THREAD
-struct task_struct *idle_thread_get(unsigned int cpu);
void idle_thread_set_boot_cpu(void);
void idle_threads_init(void);
#else
-static inline struct task_struct *idle_thread_get(unsigned int cpu) { return NULL; }
static inline void idle_thread_set_boot_cpu(void) { }
static inline void idle_threads_init(void) { }
#endif
diff --git a/mm/kasan/kasan.c b/mm/kasan/kasan.c
index bc0a8d8..c4ffd82 100644
--- a/mm/kasan/kasan.c
+++ b/mm/kasan/kasan.c
@@ -16,6 +16,7 @@
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
#define DISABLE_BRANCH_PROFILING

+#include <linux/cpu.h>
#include <linux/export.h>
#include <linux/init.h>
#include <linux/kernel.h>
@@ -537,16 +538,36 @@ static int kasan_mem_notifier(struct notifier_block *nb,
{
return (action == MEM_GOING_ONLINE) ? NOTIFY_BAD : NOTIFY_OK;
}
+#endif
+
+static int kasan_cpu_callback(struct notifier_block *nfb,
+ unsigned long action, void *hcpu)
+{
+ unsigned int cpu = (unsigned long)hcpu;
+
+ if ((action == CPU_UP_PREPARE) || (action == CPU_UP_PREPARE_FROZEN)) {
+ struct task_struct *tidle = idle_thread_get(cpu);
+ kasan_unpoison_shadow(task_stack_page(tidle), THREAD_SIZE);
+ }
+ return NOTIFY_OK;
+}

-static int __init kasan_memhotplug_init(void)
+static struct notifier_block kasan_cpu_notifier =
{
- pr_err("WARNING: KASAN doesn't support memory hot-add\n");
- pr_err("Memory hot-add will be disabled\n");
+ .notifier_call = kasan_cpu_callback,
+};

- hotplug_memory_notifier(kasan_mem_notifier, 0);
+static int __init kasan_notifiers_init(void)
+{
+ if (IS_ENABLED(CONFIG_MEMORY_HOTPLUG)) {
+ pr_err("WARNING: KASAN doesn't support memory hot-add\n");
+ pr_err("Memory hot-add will be disabled\n");
+ hotplug_memory_notifier(kasan_mem_notifier, 0);
+ }
+
+ register_hotcpu_notifier(&kasan_cpu_notifier);

return 0;
}

-module_init(kasan_memhotplug_init);
-#endif
+module_init(kasan_notifiers_init);
--
2.4.10

2016-03-02 13:52:44

by Andrey Ryabinin

[permalink] [raw]
Subject: [PATCH 1/2] cpu, idle: move init_idle() out of idle_thread_get()

Currently idle_thread_get() not only gets the idle task but also
initializes it. This is fine for _cpu_up() which is the only caller
of idle_thread_get().

idle_thread_get() is going to be used in the next patch, but it
would be nice to avoid double initialization of idle task.
Hence this patch removes init_idle() call from idle_thread_get()
and invokes it directly from _cpu_up().

No functional changes here.

Signed-off-by: Andrey Ryabinin <[email protected]>
---
kernel/cpu.c | 2 ++
kernel/smpboot.c | 1 -
2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/cpu.c b/kernel/cpu.c
index 5b9d396..63d8e7b 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -503,6 +503,8 @@ static int _cpu_up(unsigned int cpu, int tasks_frozen)
goto out;
}

+ init_idle(idle, cpu);
+
ret = smpboot_create_threads(cpu);
if (ret)
goto out;
diff --git a/kernel/smpboot.c b/kernel/smpboot.c
index d264f59..74687e8 100644
--- a/kernel/smpboot.c
+++ b/kernel/smpboot.c
@@ -31,7 +31,6 @@ struct task_struct *idle_thread_get(unsigned int cpu)

if (!tsk)
return ERR_PTR(-ENOMEM);
- init_idle(tsk, cpu);
return tsk;
}

--
2.4.10

2016-03-02 14:50:16

by Mark Rutland

[permalink] [raw]
Subject: Re: [PATCH 2/2] kasan: unpoison stack of idle task on cpu online

Hi,

On Wed, Mar 02, 2016 at 04:51:59PM +0300, Andrey Ryabinin wrote:
> KASAN poisons stack redzones on function's entrance and unpoisons prior
> return. So when cpu goes offline the stack of idle task left poisoned.
> When cpu goes back online it re-enters the kernel via another path and
> starts using idle task's stack. Hence it's possible to hit stale poison
> values which results in false-positive KASAN splats.
>
> This patch registers cpu hotplug notifier which unpoisons idle task's
> stack prior to onlining cpu.

Sorry, I failed to spot this before sending my series just now.

FWIW, I have no strong feelings either way as to how we hook up the
stack shadow clearing in the hotplug case.

It would be good if we could organise to share the infrastructure for
idle, though.

Otherwise, I have a couple of comments below.

> Signed-off-by: Andrey Ryabinin <[email protected]>
> ---
> include/linux/sched.h | 6 ++++++
> kernel/smpboot.h | 2 --
> mm/kasan/kasan.c | 33 +++++++++++++++++++++++++++------
> 3 files changed, 33 insertions(+), 8 deletions(-)
>
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index a10494a..18e526d 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -337,6 +337,12 @@ extern asmlinkage void schedule_tail(struct task_struct *prev);
> extern void init_idle(struct task_struct *idle, int cpu);
> extern void init_idle_bootup_task(struct task_struct *idle);
>
> +#ifdef CONFIG_GENERIC_SMP_IDLE_THREAD
> +extern struct task_struct *idle_thread_get(unsigned int cpu);
> +#else
> +static inline struct task_struct *idle_thread_get(unsigned int cpu) { return NULL; }
> +#endif
> +
> extern cpumask_var_t cpu_isolated_map;
>
> extern int runqueue_is_locked(int cpu);
> diff --git a/kernel/smpboot.h b/kernel/smpboot.h
> index 72415a0..eebf9ec 100644
> --- a/kernel/smpboot.h
> +++ b/kernel/smpboot.h
> @@ -4,11 +4,9 @@
> struct task_struct;
>
> #ifdef CONFIG_GENERIC_SMP_IDLE_THREAD
> -struct task_struct *idle_thread_get(unsigned int cpu);
> void idle_thread_set_boot_cpu(void);
> void idle_threads_init(void);
> #else
> -static inline struct task_struct *idle_thread_get(unsigned int cpu) { return NULL; }
> static inline void idle_thread_set_boot_cpu(void) { }
> static inline void idle_threads_init(void) { }
> #endif

Is all the above necessary?

Surely we can just include <linux/smpboot.h> in mm/kasan/kasan.c?

> diff --git a/mm/kasan/kasan.c b/mm/kasan/kasan.c
> index bc0a8d8..c4ffd82 100644
> --- a/mm/kasan/kasan.c
> +++ b/mm/kasan/kasan.c
> @@ -16,6 +16,7 @@
> #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> #define DISABLE_BRANCH_PROFILING
>
> +#include <linux/cpu.h>
> #include <linux/export.h>
> #include <linux/init.h>
> #include <linux/kernel.h>
> @@ -537,16 +538,36 @@ static int kasan_mem_notifier(struct notifier_block *nb,
> {
> return (action == MEM_GOING_ONLINE) ? NOTIFY_BAD : NOTIFY_OK;
> }
> +#endif
> +
> +static int kasan_cpu_callback(struct notifier_block *nfb,
> + unsigned long action, void *hcpu)
> +{
> + unsigned int cpu = (unsigned long)hcpu;
> +
> + if ((action == CPU_UP_PREPARE) || (action == CPU_UP_PREPARE_FROZEN)) {
> + struct task_struct *tidle = idle_thread_get(cpu);
> + kasan_unpoison_shadow(task_stack_page(tidle), THREAD_SIZE);

We never expect the stack to hit the end of the thread_info, so we can
start at task_stack_page(tidle) + 1, and avoid the shadow for
sizeof(struct thread_info).

Do we do any poisoning of the thread_info structure in the thread_union?
If so, we'd be erroneously clearing it here.

Thanks,
Mark.

2016-03-02 15:28:03

by Andrey Ryabinin

[permalink] [raw]
Subject: Re: [PATCH 2/2] kasan: unpoison stack of idle task on cpu online



On 03/02/2016 05:50 PM, Mark Rutland wrote:
> Hi,
>
> On Wed, Mar 02, 2016 at 04:51:59PM +0300, Andrey Ryabinin wrote:
>> KASAN poisons stack redzones on function's entrance and unpoisons prior
>> return. So when cpu goes offline the stack of idle task left poisoned.
>> When cpu goes back online it re-enters the kernel via another path and
>> starts using idle task's stack. Hence it's possible to hit stale poison
>> values which results in false-positive KASAN splats.
>>
>> This patch registers cpu hotplug notifier which unpoisons idle task's
>> stack prior to onlining cpu.
>
> Sorry, I failed to spot this before sending my series just now.
>
> FWIW, I have no strong feelings either way as to how we hook up the
> stack shadow clearing in the hotplug case.
>

In fact, I'm also don't have strong opinion on this.

Ingo, Peter, what's your preference?
These patches or http://lkml.kernel.org/g/<[email protected]> ?

> It would be good if we could organise to share the infrastructure for
> idle, though.
>
> Otherwise, I have a couple of comments below.
>
>> Signed-off-by: Andrey Ryabinin <[email protected]>
>> ---
>> include/linux/sched.h | 6 ++++++
>> kernel/smpboot.h | 2 --
>> mm/kasan/kasan.c | 33 +++++++++++++++++++++++++++------
>> 3 files changed, 33 insertions(+), 8 deletions(-)
>>
>> diff --git a/include/linux/sched.h b/include/linux/sched.h
>> index a10494a..18e526d 100644
>> --- a/include/linux/sched.h
>> +++ b/include/linux/sched.h
>> @@ -337,6 +337,12 @@ extern asmlinkage void schedule_tail(struct task_struct *prev);
>> extern void init_idle(struct task_struct *idle, int cpu);
>> extern void init_idle_bootup_task(struct task_struct *idle);
>>
>> +#ifdef CONFIG_GENERIC_SMP_IDLE_THREAD
>> +extern struct task_struct *idle_thread_get(unsigned int cpu);
>> +#else
>> +static inline struct task_struct *idle_thread_get(unsigned int cpu) { return NULL; }
>> +#endif
>> +
>> extern cpumask_var_t cpu_isolated_map;
>>
>> extern int runqueue_is_locked(int cpu);
>> diff --git a/kernel/smpboot.h b/kernel/smpboot.h
>> index 72415a0..eebf9ec 100644
>> --- a/kernel/smpboot.h
>> +++ b/kernel/smpboot.h
>> @@ -4,11 +4,9 @@
>> struct task_struct;
>>
>> #ifdef CONFIG_GENERIC_SMP_IDLE_THREAD
>> -struct task_struct *idle_thread_get(unsigned int cpu);
>> void idle_thread_set_boot_cpu(void);
>> void idle_threads_init(void);
>> #else
>> -static inline struct task_struct *idle_thread_get(unsigned int cpu) { return NULL; }
>> static inline void idle_thread_set_boot_cpu(void) { }
>> static inline void idle_threads_init(void) { }
>> #endif
>
> Is all the above necessary?
>
> Surely we can just include <linux/smpboot.h> in mm/kasan/kasan.c?
>

It is necessary. kernel/smpboot.h != include/linux/smpboot.h


>> diff --git a/mm/kasan/kasan.c b/mm/kasan/kasan.c
>> index bc0a8d8..c4ffd82 100644
>> --- a/mm/kasan/kasan.c
>> +++ b/mm/kasan/kasan.c
>> @@ -16,6 +16,7 @@
>> #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
>> #define DISABLE_BRANCH_PROFILING
>>
>> +#include <linux/cpu.h>
>> #include <linux/export.h>
>> #include <linux/init.h>
>> #include <linux/kernel.h>
>> @@ -537,16 +538,36 @@ static int kasan_mem_notifier(struct notifier_block *nb,
>> {
>> return (action == MEM_GOING_ONLINE) ? NOTIFY_BAD : NOTIFY_OK;
>> }
>> +#endif
>> +
>> +static int kasan_cpu_callback(struct notifier_block *nfb,
>> + unsigned long action, void *hcpu)
>> +{
>> + unsigned int cpu = (unsigned long)hcpu;
>> +
>> + if ((action == CPU_UP_PREPARE) || (action == CPU_UP_PREPARE_FROZEN)) {
>> + struct task_struct *tidle = idle_thread_get(cpu);
>> + kasan_unpoison_shadow(task_stack_page(tidle), THREAD_SIZE);
>
> We never expect the stack to hit the end of the thread_info, so we can
> start at task_stack_page(tidle) + 1, and avoid the shadow for
> sizeof(struct thread_info).
>

I wouldn't bother, it's simpler to unpoison all. Size of struct thread_info is 32-bytes. That's 4-bytes of shadow.
I don't think it matters whether you do memset of 2048 or 2044 bytes.

> Do we do any poisoning of the thread_info structure in the thread_union?

No, why would we poison it? It's absolutely valid memory and available for access.

> If so, we'd be erroneously clearing it here.
>
> Thanks,
> Mark.
>

2016-03-02 15:44:09

by Mark Rutland

[permalink] [raw]
Subject: Re: [PATCH 2/2] kasan: unpoison stack of idle task on cpu online

On Wed, Mar 02, 2016 at 06:27:49PM +0300, Andrey Ryabinin wrote:
> On 03/02/2016 05:50 PM, Mark Rutland wrote:
> > On Wed, Mar 02, 2016 at 04:51:59PM +0300, Andrey Ryabinin wrote:

[...]

> > Is all the above necessary?
> >
> > Surely we can just include <linux/smpboot.h> in mm/kasan/kasan.c?
>
> It is necessary. kernel/smpboot.h != include/linux/smpboot.h

Ah, I'd misread the patch. Sorry for the noise!

[...]

> >> + struct task_struct *tidle = idle_thread_get(cpu);
> >> + kasan_unpoison_shadow(task_stack_page(tidle), THREAD_SIZE);
> >
> > We never expect the stack to hit the end of the thread_info, so we can
> > start at task_stack_page(tidle) + 1, and avoid the shadow for
> > sizeof(struct thread_info).
> >
>
> I wouldn't bother, it's simpler to unpoison all. Size of struct thread_info is 32-bytes. That's 4-bytes of shadow.
> I don't think it matters whether you do memset of 2048 or 2044 bytes.
>
> > Do we do any poisoning of the thread_info structure in the thread_union?
>
> No, why would we poison it? It's absolutely valid memory and available for access.

For some reason I thought ASAN might poison gaps between struct
elements, or at least held open the option to. I guess inserting padding
would be an ABI issue, so it probably doesn't.

In the absence of that, I agree that always starting at
task_stack_page(t), and clearing the shadow for THREAD_SIZE bytes of
stack makes sense).

Thanks,
Mark.