There is a window for racing when printing directly to task->comm,
allowing other threads to see a non-terminated string. The vsnprintf
function fills the buffer, counts the truncated chars, then finally
writes the \0 at the end.
    creator                     other
    vsnprintf:
      fill (not terminated)
      count the rest            trace_sched_waking(p):
      ...                         memcpy(comm, p->comm, TASK_COMM_LEN)
      write \0
The consequences depend on how 'other' uses the string. In our case,
it was copied into the tracing system's saved cmdlines, a buffer of
adjacent TASK_COMM_LEN-byte buffers (note the 'n' where 0 should be):
crash-arm64> x/1024s savedcmd->saved_cmdlines | grep 'evenk'
0xffffffd5b3818640: "irq/497-pwr_evenkworker/u16:12"
...and a strcpy out of there would cause stack corruption:
[224761.522292] Kernel panic - not syncing: stack-protector:
Kernel stack is corrupted in: ffffff9bf9783c78
crash-arm64> kbt | grep 'comm\|trace_print_context'
#6 0xffffff9bf9783c78 in trace_print_context+0x18c(+396)
comm (char [16]) = "irq/497-pwr_even"
crash-arm64> rd 0xffffffd4d0e17d14 8
ffffffd4d0e17d14: 2f71726900000000 5f7277702d373934 ....irq/497-pwr_
ffffffd4d0e17d24: 726f776b6e657665 3a3631752f72656b evenkworker/u16:
ffffffd4d0e17d34: f9780248ff003231 cede60e0ffffff9b 12..H.x......`..
ffffffd4d0e17d44: cede60c8ffffffd4 00000fffffffffd4 .....`..........
The workaround in e09e28671 (use strlcpy in __trace_find_cmdline) was
likely needed because of this same bug.
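
The overrun, and the effect of a bounded copy, can be reproduced in
userspace. The following is only a sketch: the flat `saved` array and
the helpers fill_slots(), slot(), unbounded_len() and bounded_copy()
are hypothetical stand-ins, not the tracing code itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TASK_COMM_LEN 16
#define NSLOTS 2

/* Hypothetical stand-in for saved_cmdlines: adjacent 16-byte slots. */
static char saved[NSLOTS * TASK_COMM_LEN];

static char *slot(int i)
{
	return saved + i * TASK_COMM_LEN;
}

/* Slot 0 gets 16 bytes with no NUL, as seen mid-vsnprintf; slot 1 a neighbour. */
static void fill_slots(void)
{
	memset(saved, 0, sizeof(saved));
	/* Copies exactly 16 characters; the literal's NUL is not copied. */
	memcpy(slot(0), "irq/497-pwr_even", TASK_COMM_LEN);
	strcpy(slot(1), "kworker/u16:12");
}

/* Length an unbounded reader (strlen/strcpy) sees when starting at slot 0. */
static size_t unbounded_len(void)
{
	return strlen(slot(0));
}

/* strlcpy-like copy: at most size - 1 bytes, always NUL-terminated. */
static size_t bounded_copy(char *dst, const char *src, size_t size)
{
	size_t n = 0;

	assert(size > 0);
	while (n + 1 < size && src[n] != '\0')
		n++;
	memcpy(dst, src, n);
	dst[n] = '\0';
	return n;
}
```

After fill_slots(), unbounded_len() returns 30, which does not fit in
a char[16] on the stack, while bounded_copy() into a TASK_COMM_LEN
buffer stops after 15 characters.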
Solved by vsnprintf:ing to a local buffer, then using set_task_comm().
This way, there won't be a window where comm is not terminated.
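
As a rough userspace illustration of that pattern (not the kernel
implementation): format into a local buffer first, then publish with a
copy that never stores a non-NUL value in the last byte. struct
fake_task, publish_comm() and set_name() are hypothetical names, and
any locking that set_task_comm() performs is left out of the sketch.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define TASK_COMM_LEN 16

/* Hypothetical stand-in for the comm field of a task. */
struct fake_task {
	char comm[TASK_COMM_LEN];
};

/*
 * Copy at most TASK_COMM_LEN - 1 bytes and then write the terminator.
 * Assumes comm was NUL-terminated on entry, so the last byte stays NUL
 * throughout and a concurrent reader never sees an unterminated string.
 */
static void publish_comm(struct fake_task *t, const char *name)
{
	size_t n = 0;

	while (n + 1 < sizeof(t->comm) && name[n] != '\0')
		n++;
	memcpy(t->comm, name, n);
	t->comm[n] = '\0';
}

/* Format into a private buffer; only the finished string is published. */
static void set_name(struct fake_task *t, const char *fmt, ...)
{
	char name[TASK_COMM_LEN];
	va_list args;

	va_start(args, fmt);
	vsnprintf(name, sizeof(name), fmt, args);
	va_end(args);

	publish_comm(t, name);
}
```

For example, set_name(&t, "irq/%d-%s", 497, "pwr_event"), with an
invented name chosen only because it is longer than 15 characters,
leaves t.comm as "irq/497-pwr_eve".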
Signed-off-by: Snild Dolkow <[email protected]>
---
kernel/kthread.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/kernel/kthread.c b/kernel/kthread.c
index 481951bf091d..1a481ae12dec 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -319,8 +319,14 @@ struct task_struct *__kthread_create_on_node(int (*threadfn)(void *data),
 	task = create->result;
 	if (!IS_ERR(task)) {
 		static const struct sched_param param = { .sched_priority = 0 };
+		char name[TASK_COMM_LEN];
 
-		vsnprintf(task->comm, sizeof(task->comm), namefmt, args);
+		/*
+		 * task is already visible to other tasks, so updating
+		 * COMM must be protected.
+		 */
+		vsnprintf(name, sizeof(name), namefmt, args);
+		set_task_comm(task, name);
 		/*
 		 * root may have changed our (kthreadd's) priority or CPU mask.
 		 * The kernel thread should not inherit these properties.
--
2.15.1
On Tue, 24 Jul 2018 17:12:13 +0200
Snild Dolkow <[email protected]> wrote:
> There is a window for racing when printing directly to task->comm,
> allowing other threads to see a non-terminated string. The vsnprintf
> function fills the buffer, counts the truncated chars, then finally
> writes the \0 at the end.
>
>     creator                     other
>     vsnprintf:
>       fill (not terminated)
>       count the rest            trace_sched_waking(p):
>       ...                         memcpy(comm, p->comm, TASK_COMM_LEN)
>       write \0
>
> The consequences depend on how 'other' uses the string. In our case,
> it was copied into the tracing system's saved cmdlines, a buffer of
> adjacent TASK_COMM_LEN-byte buffers (note the 'n' where 0 should be):
>
> crash-arm64> x/1024s savedcmd->saved_cmdlines | grep 'evenk'
> 0xffffffd5b3818640: "irq/497-pwr_evenkworker/u16:12"
>
> ...and a strcpy out of there would cause stack corruption:
>
> [224761.522292] Kernel panic - not syncing: stack-protector:
> Kernel stack is corrupted in: ffffff9bf9783c78
>
> crash-arm64> kbt | grep 'comm\|trace_print_context'
> #6 0xffffff9bf9783c78 in trace_print_context+0x18c(+396)
> comm (char [16]) = "irq/497-pwr_even"
>
> crash-arm64> rd 0xffffffd4d0e17d14 8
> ffffffd4d0e17d14: 2f71726900000000 5f7277702d373934 ....irq/497-pwr_
> ffffffd4d0e17d24: 726f776b6e657665 3a3631752f72656b evenkworker/u16:
> ffffffd4d0e17d34: f9780248ff003231 cede60e0ffffff9b 12..H.x......`..
> ffffffd4d0e17d44: cede60c8ffffffd4 00000fffffffffd4 .....`..........
>
> The workaround in e09e28671 (use strlcpy in __trace_find_cmdline) was
> likely needed because of this same bug.
>
> Solved by vsnprintf:ing to a local buffer, then using set_task_comm().
> This way, there won't be a window where comm is not terminated.
>
Should add:
Cc: [email protected]
This bug has been there since the beginning of git history, but nothing
could really trigger it until tracing came along. Thus:
Fixes: bc0c38d139ec7 ("ftrace: latency tracer infrastructure")
> Signed-off-by: Snild Dolkow <[email protected]>
Reviewed-by: Steven Rostedt (VMware) <[email protected]>
Now the question is, who will take this in their tree?
I can, unless someone else wants it. But I won't without another
Acked-by.
-- Steve
> ---
> kernel/kthread.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/kthread.c b/kernel/kthread.c
> index 481951bf091d..1a481ae12dec 100644
> --- a/kernel/kthread.c
> +++ b/kernel/kthread.c
> @@ -319,8 +319,14 @@ struct task_struct *__kthread_create_on_node(int (*threadfn)(void *data),
>  	task = create->result;
>  	if (!IS_ERR(task)) {
>  		static const struct sched_param param = { .sched_priority = 0 };
> +		char name[TASK_COMM_LEN];
>  
> -		vsnprintf(task->comm, sizeof(task->comm), namefmt, args);
> +		/*
> +		 * task is already visible to other tasks, so updating
> +		 * COMM must be protected.
> +		 */
> +		vsnprintf(name, sizeof(name), namefmt, args);
> +		set_task_comm(task, name);
>  		/*
>  		 * root may have changed our (kthreadd's) priority or CPU mask.
>  		 * The kernel thread should not inherit these properties.
There is a window for racing when printing directly to task->comm,
allowing other threads to see a non-terminated string. The vsnprintf
function fills the buffer, counts the truncated chars, then finally
writes the \0 at the end.
    creator                     other
    vsnprintf:
      fill (not terminated)
      count the rest            trace_sched_waking(p):
      ...                         memcpy(comm, p->comm, TASK_COMM_LEN)
      write \0
The consequences depend on how 'other' uses the string. In our case,
it was copied into the tracing system's saved cmdlines, a buffer of
adjacent TASK_COMM_LEN-byte buffers (note the 'n' where 0 should be):
crash-arm64> x/1024s savedcmd->saved_cmdlines | grep 'evenk'
0xffffffd5b3818640: "irq/497-pwr_evenkworker/u16:12"
...and a strcpy out of there would cause stack corruption:
[224761.522292] Kernel panic - not syncing: stack-protector:
Kernel stack is corrupted in: ffffff9bf9783c78
crash-arm64> kbt | grep 'comm\|trace_print_context'
#6 0xffffff9bf9783c78 in trace_print_context+0x18c(+396)
comm (char [16]) = "irq/497-pwr_even"
crash-arm64> rd 0xffffffd4d0e17d14 8
ffffffd4d0e17d14: 2f71726900000000 5f7277702d373934 ....irq/497-pwr_
ffffffd4d0e17d24: 726f776b6e657665 3a3631752f72656b evenkworker/u16:
ffffffd4d0e17d34: f9780248ff003231 cede60e0ffffff9b 12..H.x......`..
ffffffd4d0e17d44: cede60c8ffffffd4 00000fffffffffd4 .....`..........
The workaround in e09e28671 (use strlcpy in __trace_find_cmdline) was
likely needed because of this same bug.
Solved by vsnprintf:ing to a local buffer, then using set_task_comm().
This way, there won't be a window where comm is not terminated.
Cc: [email protected]
Fixes: bc0c38d139ec7 ("ftrace: latency tracer infrastructure")
Reviewed-by: Steven Rostedt (VMware) <[email protected]>
Signed-off-by: Snild Dolkow <[email protected]>
---
kernel/kthread.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/kernel/kthread.c b/kernel/kthread.c
index 481951bf091d..1a481ae12dec 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -319,8 +319,14 @@ struct task_struct *__kthread_create_on_node(int (*threadfn)(void *data),
 	task = create->result;
 	if (!IS_ERR(task)) {
 		static const struct sched_param param = { .sched_priority = 0 };
+		char name[TASK_COMM_LEN];
 
-		vsnprintf(task->comm, sizeof(task->comm), namefmt, args);
+		/*
+		 * task is already visible to other tasks, so updating
+		 * COMM must be protected.
+		 */
+		vsnprintf(name, sizeof(name), namefmt, args);
+		set_task_comm(task, name);
 		/*
 		 * root may have changed our (kthreadd's) priority or CPU mask.
 		 * The kernel thread should not inherit these properties.
--
2.15.1
Thanks for sending this.
Unless someone else has an issue with this, I'll just take it in my
tree. I'm currently running a bunch of commits through my tests for
this current rc cycle. I'll add this to the bunch.
I still would like to have another ack from somebody.
-- Steve