This patchset makes it possible to use kdb's ftdump command without
panicing, crashing or livelocking. The ftdump command cannot be used
at all without these changes.
IIRC this patches are still pending Jason's ack.
Changes since v1:
* Fixed use-after-free problems in v1 by adding logic to NULL out
buffer_iter during free operations.
* Introduced a second patch to fix a live lock when trying to display
an empty trace buffer.
Daniel Thompson (2):
trace: kdb: Fix kernel panic during ftdump
trace: kdb: Fix kernel livelock with empty buffers
kernel/trace/trace_kdb.c | 21 +++++++++++++--------
1 file changed, 13 insertions(+), 8 deletions(-)
--
1.9.3
Currently kdb's ftdump command unconditionally crashes due to a null
pointer de-reference whenever the command is run. This in turn causes
the kernel to panic.
The abridged stacktrace (gathered with ARCH=arm) is:
--- cut here ---
[<c09535ac>] (panic) from [<c02132dc>] (die+0x264/0x440)
[<c02132dc>] (die) from [<c0952eb8>]
(__do_kernel_fault.part.11+0x74/0x84)
[<c0952eb8>] (__do_kernel_fault.part.11) from [<c021f954>]
(do_page_fault+0x1d0/0x3c4)
[<c021f954>] (do_page_fault) from [<c020846c>] (do_DataAbort+0x48/0xac)
[<c020846c>] (do_DataAbort) from [<c0213c58>] (__dabt_svc+0x38/0x60)
Exception stack(0xc0deba88 to 0xc0debad0)
ba80: e8c29180 00000001 e9854304 e9854300 c0f567d8
c0df2580
baa0: 00000000 00000000 00000000 c0f117b8 c0e3a3c0 c0debb0c 00000000
c0debad0
bac0: 0000672e c02f4d60 60000193 ffffffff
[<c0213c58>] (__dabt_svc) from [<c02f4d60>] (kdb_ftdump+0x1e4/0x3d8)
[<c02f4d60>] (kdb_ftdump) from [<c02ce328>] (kdb_parse+0x2b8/0x698)
[<c02ce328>] (kdb_parse) from [<c02ceef0>] (kdb_main_loop+0x52c/0x784)
[<c02ceef0>] (kdb_main_loop) from [<c02d1b0c>] (kdb_stub+0x238/0x490)
--- cut here ---
The NULL deref occurs due to the initialized use of struct trace_iter's
buffer_iter member.
This is a regression, albeit a fairly elderly one. It was introduced
by commit 6d158a813efc ("tracing: Remove NR_CPUS array from
trace_iterator").
This patch solves this by providing a collection of ring_buffer_iter(s)
and using this to initialize buffer_iter. Note that static allocation
is used solely because the trace_iter itself is also static allocated.
Static allocation also means that we have to NULL-ify the pointer during
cleanup to avoid use-after-free problems.
Signed-off-by: Daniel Thompson <[email protected]>
Cc: Jason Wessel <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Ingo Molnar <[email protected]>
---
kernel/trace/trace_kdb.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/kernel/trace/trace_kdb.c b/kernel/trace/trace_kdb.c
index bd90e1b..8faa7ce 100644
--- a/kernel/trace/trace_kdb.c
+++ b/kernel/trace/trace_kdb.c
@@ -20,10 +20,12 @@ static void ftrace_dump_buf(int skip_lines, long cpu_file)
{
/* use static because iter can be a bit big for the stack */
static struct trace_iterator iter;
+ static struct ring_buffer_iter *buffer_iter[CONFIG_NR_CPUS];
unsigned int old_userobj;
int cnt = 0, cpu;
trace_init_global_iter(&iter);
+ iter.buffer_iter = buffer_iter;
for_each_tracing_cpu(cpu) {
atomic_inc(&per_cpu_ptr(iter.trace_buffer->data, cpu)->disabled);
@@ -86,9 +88,12 @@ out:
atomic_dec(&per_cpu_ptr(iter.trace_buffer->data, cpu)->disabled);
}
- for_each_tracing_cpu(cpu)
- if (iter.buffer_iter[cpu])
+ for_each_tracing_cpu(cpu) {
+ if (iter.buffer_iter[cpu]) {
ring_buffer_read_finish(iter.buffer_iter[cpu]);
+ iter.buffer_iter[cpu] = NULL;
+ }
+ }
}
/*
--
1.9.3
Currently kdb's ftdump command will livelock by constantly printk'ing
the empty string at KERN_EMERG level if it run when the ftrace system is
not in use. This occurs because trace_empty() never returns false when
the ring buffers are left at the start of a non-consuming read [launched
by ring_buffer_read_start()].
This patch changes the loop exit condition to use the result of
trace_find_next_entry_inc(). Effectively this switches the non-consuming
kdb dumper to follow the approach of the non-consuming userspace
interface [s_next()] rather than the consuming ftrace_dump().
Signed-off-by: Daniel Thompson <[email protected]>
Cc: Jason Wessel <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Ingo Molnar <[email protected]>
---
kernel/trace/trace_kdb.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/kernel/trace/trace_kdb.c b/kernel/trace/trace_kdb.c
index 8faa7ce..b0b1c44 100644
--- a/kernel/trace/trace_kdb.c
+++ b/kernel/trace/trace_kdb.c
@@ -59,19 +59,19 @@ static void ftrace_dump_buf(int skip_lines, long cpu_file)
ring_buffer_read_start(iter.buffer_iter[cpu_file]);
tracing_iter_reset(&iter, cpu_file);
}
- if (!trace_empty(&iter))
- trace_find_next_entry_inc(&iter);
- while (!trace_empty(&iter)) {
+
+ while (trace_find_next_entry_inc(&iter)) {
if (!cnt)
kdb_printf("---------------------------------\n");
cnt++;
- if (trace_find_next_entry_inc(&iter) != NULL && !skip_lines)
+ if (!skip_lines) {
print_trace_line(&iter);
- if (!skip_lines)
trace_printk_seq(&iter.seq);
- else
+ } else {
skip_lines--;
+ }
+
if (KDB_FLAG(CMD_INTERRUPT))
goto out;
}
--
1.9.3
On Thu, 6 Nov 2014 12:41:54 +0000
Daniel Thompson <[email protected]> wrote:
> This patchset makes it possible to use kdb's ftdump command without
> panicing, crashing or livelocking. The ftdump command cannot be used
> at all without these changes.
>
> IIRC this patches are still pending Jason's ack.
I haven't heard from Jason in a long time. Is he still active?
-- Steve
>
> Changes since v1:
>
> * Fixed use-after-free problems in v1 by adding logic to NULL out
> buffer_iter during free operations.
>
> * Introduced a second patch to fix a live lock when trying to display
> an empty trace buffer.
>
>
> Daniel Thompson (2):
> trace: kdb: Fix kernel panic during ftdump
> trace: kdb: Fix kernel livelock with empty buffers
>
> kernel/trace/trace_kdb.c | 21 +++++++++++++--------
> 1 file changed, 13 insertions(+), 8 deletions(-)
>
> --
> 1.9.3
On 06/11/14 13:27, Steven Rostedt wrote:
> On Thu, 6 Nov 2014 12:41:54 +0000
> Daniel Thompson <[email protected]> wrote:
>
>> This patchset makes it possible to use kdb's ftdump command without
>> panicing, crashing or livelocking. The ftdump command cannot be used
>> at all without these changes.
>>
>> IIRC this patches are still pending Jason's ack.
>
> I haven't heard from Jason in a long time. Is he still active?
[sorry for the delay, I wanted to give Jason a chance to answer this]
Very occasionally.
I can't find anything on lkml in the last three months, and I have
unreviewed kdb patches that stretch back well beyond that.
That said he still helps people on kgdb-bugreport@ from time-to-time
(and as recently as last week). I've also had a little bit of private
contact although nothing very recent.
On that basis I'd say you shouldn't feel guilty if you have to accept a
change here without an ack.
Daniel.
On Mon, 10 Nov 2014 09:41:32 +0000
Daniel Thompson <[email protected]> wrote:
> On 06/11/14 13:27, Steven Rostedt wrote:
> > On Thu, 6 Nov 2014 12:41:54 +0000
> > Daniel Thompson <[email protected]> wrote:
> >
> >> This patchset makes it possible to use kdb's ftdump command without
> >> panicing, crashing or livelocking. The ftdump command cannot be used
> >> at all without these changes.
> >>
> >> IIRC this patches are still pending Jason's ack.
> >
> > I haven't heard from Jason in a long time. Is he still active?
>
> [sorry for the delay, I wanted to give Jason a chance to answer this]
>
> Very occasionally.
>
> I can't find anything on lkml in the last three months, and I have
> unreviewed kdb patches that stretch back well beyond that.
>
> That said he still helps people on kgdb-bugreport@ from time-to-time
> (and as recently as last week). I've also had a little bit of private
> contact although nothing very recent.
>
> On that basis I'd say you shouldn't feel guilty if you have to accept a
> change here without an ack.
>
He had more than enough time to respond. OK, I'll take it.
Looking at the first patch, I notice that there's no protection of the
static buffer_iter array. I also noticed that there's no protection of
the static iter itself (which was there before your patch). I take it
that this code is not re-entrant.
Thanks!
-- Steve
On Thu, 6 Nov 2014 12:41:55 +0000
Daniel Thompson <[email protected]> wrote:
> Currently kdb's ftdump command unconditionally crashes due to a null
> pointer de-reference whenever the command is run. This in turn causes
> the kernel to panic.
>
> The abridged stacktrace (gathered with ARCH=arm) is:
> --- cut here ---
This is a nasty line. "git am " truncated the change log there thinking
it was the end of the change log (that '---' did it).
-- Steve
> [<c09535ac>] (panic) from [<c02132dc>] (die+0x264/0x440)
> [<c02132dc>] (die) from [<c0952eb8>]
> (__do_kernel_fault.part.11+0x74/0x84)
> [<c0952eb8>] (__do_kernel_fault.part.11) from [<c021f954>]
> (do_page_fault+0x1d0/0x3c4)
On 14/11/14 02:26, Steven Rostedt wrote:
> On Thu, 6 Nov 2014 12:41:55 +0000
> Daniel Thompson <[email protected]> wrote:
>
>> Currently kdb's ftdump command unconditionally crashes due to a null
>> pointer de-reference whenever the command is run. This in turn causes
>> the kernel to panic.
>>
>> The abridged stacktrace (gathered with ARCH=arm) is:
>> --- cut here ---
>
> This is a nasty line. "git am " truncated the change log there thinking
> it was the end of the change log (that '---' did it).
Oops. Looks like I shall have to train myself out of that particular
habit (which will be tough, since I've used it long enough to be ingrained).
Will repost as soon as I've thought about your other comments.
Daniel.
On 14/11/14 02:16, Steven Rostedt wrote:
> On Mon, 10 Nov 2014 09:41:32 +0000
> Daniel Thompson <[email protected]> wrote:
>
>> On 06/11/14 13:27, Steven Rostedt wrote:
>>> On Thu, 6 Nov 2014 12:41:54 +0000
>>> Daniel Thompson <[email protected]> wrote:
>>>
>>>> This patchset makes it possible to use kdb's ftdump command without
>>>> panicing, crashing or livelocking. The ftdump command cannot be used
>>>> at all without these changes.
>>>>
>>>> IIRC this patches are still pending Jason's ack.
>>>
>>> I haven't heard from Jason in a long time. Is he still active?
>>
>> [sorry for the delay, I wanted to give Jason a chance to answer this]
>>
>> Very occasionally.
>>
>> I can't find anything on lkml in the last three months, and I have
>> unreviewed kdb patches that stretch back well beyond that.
>>
>> That said he still helps people on kgdb-bugreport@ from time-to-time
>> (and as recently as last week). I've also had a little bit of private
>> contact although nothing very recent.
>>
>> On that basis I'd say you shouldn't feel guilty if you have to accept a
>> change here without an ack.
>>
>
> He had more than enough time to respond. OK, I'll take it.
>
> Looking at the first patch, I notice that there's no protection of the
> static buffer_iter array. I also noticed that there's no protection of
> the static iter itself (which was there before your patch). I take it
> that this code is not re-entrant.
No.
k(g)db halts all other processors before entering the command dispatch
loop and it forces a kernel panic if the debugger is reentered by the
same CPU.
Daniel.
This patchset makes it possible to use kdb's ftdump command without
panicing, crashing or livelocking. The ftdump command cannot be used
at all without these changes.
IIRC this patches are still pending Jason's ack.
v3:
* Remove lines leading with --- from the patch comments, now it it
-~- cut here -~- and "git am" did the right thing when I checked
(Steven Rostedt)
v2:
* Fixed use-after-free problems in v1 by adding logic to NULL out
buffer_iter during free operations.
* Introduced a second patch to fix a live lock when trying to display
an empty trace buffer.
Daniel Thompson (2):
trace: kdb: Fix kernel panic during ftdump
trace: kdb: Fix kernel livelock with empty buffers
kernel/trace/trace_kdb.c | 21 +++++++++++++--------
1 file changed, 13 insertions(+), 8 deletions(-)
--
1.9.3
Currently kdb's ftdump command will livelock by constantly printk'ing
the empty string at KERN_EMERG level if it run when the ftrace system is
not in use. This occurs because trace_empty() never returns false when
the ring buffers are left at the start of a non-consuming read [launched
by ring_buffer_read_start()].
This patch changes the loop exit condition to use the result of
trace_find_next_entry_inc(). Effectively this switches the non-consuming
kdb dumper to follow the approach of the non-consuming userspace
interface [s_next()] rather than the consuming ftrace_dump().
Signed-off-by: Daniel Thompson <[email protected]>
Cc: Jason Wessel <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Ingo Molnar <[email protected]>
---
kernel/trace/trace_kdb.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/kernel/trace/trace_kdb.c b/kernel/trace/trace_kdb.c
index 8faa7ce58814..b0b1c44e923a 100644
--- a/kernel/trace/trace_kdb.c
+++ b/kernel/trace/trace_kdb.c
@@ -59,19 +59,19 @@ static void ftrace_dump_buf(int skip_lines, long cpu_file)
ring_buffer_read_start(iter.buffer_iter[cpu_file]);
tracing_iter_reset(&iter, cpu_file);
}
- if (!trace_empty(&iter))
- trace_find_next_entry_inc(&iter);
- while (!trace_empty(&iter)) {
+
+ while (trace_find_next_entry_inc(&iter)) {
if (!cnt)
kdb_printf("---------------------------------\n");
cnt++;
- if (trace_find_next_entry_inc(&iter) != NULL && !skip_lines)
+ if (!skip_lines) {
print_trace_line(&iter);
- if (!skip_lines)
trace_printk_seq(&iter.seq);
- else
+ } else {
skip_lines--;
+ }
+
if (KDB_FLAG(CMD_INTERRUPT))
goto out;
}
--
1.9.3
Currently kdb's ftdump command unconditionally crashes due to a null
pointer de-reference whenever the command is run. This in turn causes
the kernel to panic.
The abridged stacktrace (gathered with ARCH=arm) is:
-~- cut here -~-
[<c09535ac>] (panic) from [<c02132dc>] (die+0x264/0x440)
[<c02132dc>] (die) from [<c0952eb8>]
(__do_kernel_fault.part.11+0x74/0x84)
[<c0952eb8>] (__do_kernel_fault.part.11) from [<c021f954>]
(do_page_fault+0x1d0/0x3c4)
[<c021f954>] (do_page_fault) from [<c020846c>] (do_DataAbort+0x48/0xac)
[<c020846c>] (do_DataAbort) from [<c0213c58>] (__dabt_svc+0x38/0x60)
Exception stack(0xc0deba88 to 0xc0debad0)
ba80: e8c29180 00000001 e9854304 e9854300 c0f567d8
c0df2580
baa0: 00000000 00000000 00000000 c0f117b8 c0e3a3c0 c0debb0c 00000000
c0debad0
bac0: 0000672e c02f4d60 60000193 ffffffff
[<c0213c58>] (__dabt_svc) from [<c02f4d60>] (kdb_ftdump+0x1e4/0x3d8)
[<c02f4d60>] (kdb_ftdump) from [<c02ce328>] (kdb_parse+0x2b8/0x698)
[<c02ce328>] (kdb_parse) from [<c02ceef0>] (kdb_main_loop+0x52c/0x784)
[<c02ceef0>] (kdb_main_loop) from [<c02d1b0c>] (kdb_stub+0x238/0x490)
-~- cut here -~-
The NULL deref occurs due to the initialized use of struct trace_iter's
buffer_iter member.
This is a regression, albeit a fairly elderly one. It was introduced
by commit 6d158a813efc ("tracing: Remove NR_CPUS array from
trace_iterator").
This patch solves this by providing a collection of ring_buffer_iter(s)
and using this to initialize buffer_iter. Note that static allocation
is used solely because the trace_iter itself is also static allocated.
Static allocation also means that we have to NULL-ify the pointer during
cleanup to avoid use-after-free problems.
Signed-off-by: Daniel Thompson <[email protected]>
Cc: Jason Wessel <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Ingo Molnar <[email protected]>
---
kernel/trace/trace_kdb.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/kernel/trace/trace_kdb.c b/kernel/trace/trace_kdb.c
index bd90e1b06088..8faa7ce58814 100644
--- a/kernel/trace/trace_kdb.c
+++ b/kernel/trace/trace_kdb.c
@@ -20,10 +20,12 @@ static void ftrace_dump_buf(int skip_lines, long cpu_file)
{
/* use static because iter can be a bit big for the stack */
static struct trace_iterator iter;
+ static struct ring_buffer_iter *buffer_iter[CONFIG_NR_CPUS];
unsigned int old_userobj;
int cnt = 0, cpu;
trace_init_global_iter(&iter);
+ iter.buffer_iter = buffer_iter;
for_each_tracing_cpu(cpu) {
atomic_inc(&per_cpu_ptr(iter.trace_buffer->data, cpu)->disabled);
@@ -86,9 +88,12 @@ out:
atomic_dec(&per_cpu_ptr(iter.trace_buffer->data, cpu)->disabled);
}
- for_each_tracing_cpu(cpu)
- if (iter.buffer_iter[cpu])
+ for_each_tracing_cpu(cpu) {
+ if (iter.buffer_iter[cpu]) {
ring_buffer_read_finish(iter.buffer_iter[cpu]);
+ iter.buffer_iter[cpu] = NULL;
+ }
+ }
}
/*
--
1.9.3
On Fri, 14 Nov 2014 09:08:21 +0000
Daniel Thompson <[email protected]> wrote:
> On 14/11/14 02:26, Steven Rostedt wrote:
> > On Thu, 6 Nov 2014 12:41:55 +0000
> > Daniel Thompson <[email protected]> wrote:
> >
> >> Currently kdb's ftdump command unconditionally crashes due to a null
> >> pointer de-reference whenever the command is run. This in turn causes
> >> the kernel to panic.
> >>
> >> The abridged stacktrace (gathered with ARCH=arm) is:
> >> --- cut here ---
> >
> > This is a nasty line. "git am " truncated the change log there thinking
> > it was the end of the change log (that '---' did it).
>
> Oops. Looks like I shall have to train myself out of that particular
> habit (which will be tough, since I've used it long enough to be ingrained).
>
> Will repost as soon as I've thought about your other comments.
I was just telling you so that you wouldn't do it again. I already
pulled in your patches. You didn't need to resend. But at least it
would help you get out of that habit! J
-- Steve
On 14/11/14 11:59, Steven Rostedt wrote:
> On Fri, 14 Nov 2014 09:08:21 +0000
> Daniel Thompson <[email protected]> wrote:
>
>> On 14/11/14 02:26, Steven Rostedt wrote:
>>> On Thu, 6 Nov 2014 12:41:55 +0000
>>> Daniel Thompson <[email protected]> wrote:
>>>
>>>> Currently kdb's ftdump command unconditionally crashes due to a null
>>>> pointer de-reference whenever the command is run. This in turn causes
>>>> the kernel to panic.
>>>>
>>>> The abridged stacktrace (gathered with ARCH=arm) is:
>>>> --- cut here ---
>>>
>>> This is a nasty line. "git am " truncated the change log there thinking
>>> it was the end of the change log (that '---' did it).
>>
>> Oops. Looks like I shall have to train myself out of that particular
>> habit (which will be tough, since I've used it long enough to be ingrained).
>>
>> Will repost as soon as I've thought about your other comments.
>
> I was just telling you so that you wouldn't do it again. I already
> pulled in your patches. You didn't need to resend. But at least it
> would help you get out of that habit! J
Thanks.
BTW I did check on kernel.org before punting them back on the list:
https://git.kernel.org/cgit/linux/kernel/git/rostedt/linux-trace.git/log/?h=for-next
Am I looking in the wrong place or are they not aimed at linux-next just
yet?
Daniel.
On Fri, 14 Nov 2014 12:13:51 +0000
Daniel Thompson <[email protected]> wrote:
> BTW I did check on kernel.org before punting them back on the list:
> https://git.kernel.org/cgit/linux/kernel/git/rostedt/linux-trace.git/log/?h=for-next
>
> Am I looking in the wrong place or are they not aimed at linux-next just
> yet?
That's my repo, but I don't push up there till they pass all my tests.
That takes 8 hours to run (I usually run them overnight). I also run
them through cross compile configs as well, which I haven't done yet.
I send out [for-next] patches for all patches I push to that repo.
You'll see them when I do that.
Thanks,
-- Steve
On Fri, 14 Nov 2014 07:31:46 -0500
Steven Rostedt <[email protected]> wrote:
> On Fri, 14 Nov 2014 12:13:51 +0000
> Daniel Thompson <[email protected]> wrote:
>
> > BTW I did check on kernel.org before punting them back on the list:
> > https://git.kernel.org/cgit/linux/kernel/git/rostedt/linux-trace.git/log/?h=for-next
> >
> > Am I looking in the wrong place or are they not aimed at linux-next just
> > yet?
>
If you want to see what I'm testing. I usually push to ftrace/core
before kicking them off. ftrace/core can rebase, so you need to be
careful with doing work on top of that.
Although, I noticed I forgot to push to my repo last night.
-- Steve