2010-08-03 07:22:13

by Hui Zhu

[permalink] [raw]
Subject: [PATCH] kexec: set prstatus.pr_pid to cpu id when current->pid is 0

Hi,

I found that from gdb 7.1 to gdb-cvs-head cannot analyze the core file
that get from kdump.
What I got:
[New <main task>]
[New Thread 2719]
../../src/gdb/thread.c:884: internal-error: switch_to_thread:
Assertion `inf != NULL' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n)
That is because:
objdump -h ./vmcore

./vmcore: file format elf64-x86-64

Sections:
Idx Name Size VMA LMA File off Algn
0 note0 00000a48 0000000000000000 0000000000000000 00000238 2**0
CONTENTS, READONLY
1 .reg/0 000000d8 0000000000000000 0000000000000000 000002bc 2**2
CONTENTS
2 .reg 000000d8 0000000000000000 0000000000000000 000002bc 2**2
CONTENTS
3 .reg/2719 000000d8 0000000000000000 0000000000000000 00000420 2**2
CONTENTS
4 .reg/0 000000d8 0000000000000000 0000000000000000 00000584 2**2
CONTENTS
5 .reg/0 000000d8 0000000000000000 0000000000000000 000006e8 2**2
CONTENTS
Each of reg/n is a cpu core note. It will be a GDB thread. n is the
prstatus.pr_pid that will be the thread lwpid. Because the 3 threads
pid is same, so GDB get error.

current->pid is 0 because this cpu is in idle. So I add a check, set
prstatus.pr_pid to cpu id when current->pid is 0. Then GDB work OK
with the core.

Thanks,
Hui

Signed-off-by: Hui Zhu <[email protected]>
---
kernel/kexec.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

--- a/kernel/kexec.c
+++ b/kernel/kexec.c
@@ -1191,7 +1191,10 @@ void crash_save_cpu(struct pt_regs *regs
if (!buf)
return;
memset(&prstatus, 0, sizeof(prstatus));
- prstatus.pr_pid = current->pid;
+ if (current->pid)
+ prstatus.pr_pid = current->pid;
+ else
+ prstatus.pr_pid = cpu;
elf_core_copy_kernel_regs(&prstatus.pr_reg, regs);
buf = append_elf_note(buf, KEXEC_CORE_NOTE_NAME, NT_PRSTATUS,
&prstatus, sizeof(prstatus));


2010-08-03 07:37:56

by Eric W. Biederman

[permalink] [raw]
Subject: Re: [PATCH] kexec: set prstatus.pr_pid to cpu id when current->pid is 0

Hui Zhu <[email protected]> writes:

> Hi,
>
> I found that from gdb 7.1 to gdb-cvs-head cannot analyze the core file
> that get from kdump.
> What I got:
> [New <main task>]
> [New Thread 2719]
> ../../src/gdb/thread.c:884: internal-error: switch_to_thread:
> Assertion `inf != NULL' failed.
> A problem internal to GDB has been detected,
> further debugging may prove unreliable.
> Quit this debugging session? (y or n)
> That is because:
> objdump -h ./vmcore
>
> ./vmcore: file format elf64-x86-64
>
> Sections:
> Idx Name Size VMA LMA File off Algn
> 0 note0 00000a48 0000000000000000 0000000000000000 00000238 2**0
> CONTENTS, READONLY
> 1 .reg/0 000000d8 0000000000000000 0000000000000000 000002bc 2**2
> CONTENTS
> 2 .reg 000000d8 0000000000000000 0000000000000000 000002bc 2**2
> CONTENTS
> 3 .reg/2719 000000d8 0000000000000000 0000000000000000 00000420 2**2
> CONTENTS
> 4 .reg/0 000000d8 0000000000000000 0000000000000000 00000584 2**2
> CONTENTS
> 5 .reg/0 000000d8 0000000000000000 0000000000000000 000006e8 2**2
> CONTENTS
> Each of reg/n is a cpu core note. It will be a GDB thread. n is the
> prstatus.pr_pid that will be the thread lwpid. Because the 3 threads
> pid is same, so GDB get error.
>
> current->pid is 0 because this cpu is in idle. So I add a check, set
> prstatus.pr_pid to cpu id when current->pid is 0. Then GDB work OK
> with the core.

That is a gdb limitation. It looks to me like applying this patch will
loose information, and give you no guarantee that prstatus.pr_pid will
not equal 0.

If you want to change something please do it in a post processing tool.

Eric


> Thanks,
> Hui
>
> Signed-off-by: Hui Zhu <[email protected]>
> ---
> kernel/kexec.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> --- a/kernel/kexec.c
> +++ b/kernel/kexec.c
> @@ -1191,7 +1191,10 @@ void crash_save_cpu(struct pt_regs *regs
> if (!buf)
> return;
> memset(&prstatus, 0, sizeof(prstatus));
> - prstatus.pr_pid = current->pid;
> + if (current->pid)
> + prstatus.pr_pid = current->pid;
> + else
> + prstatus.pr_pid = cpu;
> elf_core_copy_kernel_regs(&prstatus.pr_reg, regs);
> buf = append_elf_note(buf, KEXEC_CORE_NOTE_NAME, NT_PRSTATUS,
> &prstatus, sizeof(prstatus));

2010-08-03 07:44:58

by Hui Zhu

[permalink] [raw]
Subject: Re: [PATCH] kexec: set prstatus.pr_pid to cpu id when current->pid is 0

On Tue, Aug 3, 2010 at 15:37, Eric W. Biederman <[email protected]> wrote:
> Hui Zhu <[email protected]> writes:
>
>> Hi,
>>
>> I found that from gdb 7.1 to gdb-cvs-head cannot analyze the core file
>> that get from kdump.
>> What I got:
>> [New <main task>]
>> [New Thread 2719]
>> ../../src/gdb/thread.c:884: internal-error: switch_to_thread:
>> Assertion `inf != NULL' failed.
>> A problem internal to GDB has been detected,
>> further debugging may prove unreliable.
>> Quit this debugging session? (y or n)
>> That is because:
>> ?objdump -h ./vmcore
>>
>> ./vmcore: ? ? file format elf64-x86-64
>>
>> Sections:
>> Idx Name ? ? ? ? ?Size ? ? ?VMA ? ? ? ? ? ? ? LMA ? ? ? ? ? ? ? File off ?Algn
>> ? 0 note0 ? ? ? ? 00000a48 ?0000000000000000 ?0000000000000000 ?00000238 ?2**0
>> ? ? ? ? ? ? ? ? ? CONTENTS, READONLY
>> ? 1 .reg/0 ? ? ? ?000000d8 ?0000000000000000 ?0000000000000000 ?000002bc ?2**2
>> ? ? ? ? ? ? ? ? ? CONTENTS
>> ? 2 .reg ? ? ? ? ?000000d8 ?0000000000000000 ?0000000000000000 ?000002bc ?2**2
>> ? ? ? ? ? ? ? ? ? CONTENTS
>> ? 3 .reg/2719 ? ? 000000d8 ?0000000000000000 ?0000000000000000 ?00000420 ?2**2
>> ? ? ? ? ? ? ? ? ? CONTENTS
>> ? 4 .reg/0 ? ? ? ?000000d8 ?0000000000000000 ?0000000000000000 ?00000584 ?2**2
>> ? ? ? ? ? ? ? ? ? CONTENTS
>> ? 5 .reg/0 ? ? ? ?000000d8 ?0000000000000000 ?0000000000000000 ?000006e8 ?2**2
>> ? ? ? ? ? ? ? ? ? CONTENTS
>> Each of reg/n is a cpu core note. ?It will be a GDB thread. ?n is the
>> prstatus.pr_pid that will be the thread lwpid. ?Because the 3 threads
>> pid is same, so GDB get error.
>>
>> current->pid is 0 because this cpu is in idle. ?So I add a check, set
>> prstatus.pr_pid to cpu id when current->pid is 0. ?Then GDB work OK
>> with the core.
>
> That is a gdb limitation. ?It looks to me like applying this patch will
> loose information, and give you no guarantee that prstatus.pr_pid will
> not equal 0.
>
> If you want to change something please do it in a post processing tool.
>
> Eric

Equal 0 is not a bug, the trouble is a lot of core's pid is same.

This is what gdb say:
/* Found an old thread with the same id. It has to be dead,
otherwise we wouldn't be adding a new thread with the same id.
The OS is reusing this id --- delete it, and recreate a new
one. */

Hui

>
>
>> Thanks,
>> Hui
>>
>> Signed-off-by: Hui Zhu <[email protected]>
>> ---
>> ?kernel/kexec.c | ? ?5 ++++-
>> ?1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> --- a/kernel/kexec.c
>> +++ b/kernel/kexec.c
>> @@ -1191,7 +1191,10 @@ void crash_save_cpu(struct pt_regs *regs
>> ? ? ? if (!buf)
>> ? ? ? ? ? ? ? return;
>> ? ? ? memset(&prstatus, 0, sizeof(prstatus));
>> - ? ? prstatus.pr_pid = current->pid;
>> + ? ? if (current->pid)
>> + ? ? ? ? ? ? prstatus.pr_pid = current->pid;
>> + ? ? else
>> + ? ? ? ? ? ? prstatus.pr_pid = cpu;
>> ? ? ? elf_core_copy_kernel_regs(&prstatus.pr_reg, regs);
>> ? ? ? buf = append_elf_note(buf, KEXEC_CORE_NOTE_NAME, NT_PRSTATUS,
>> ? ? ? ? ? ? ? ? ? ? ? ? ? ? &prstatus, sizeof(prstatus));
>

2010-08-03 08:15:15

by Eric W. Biederman

[permalink] [raw]
Subject: Re: [PATCH] kexec: set prstatus.pr_pid to cpu id when current->pid is 0

Hui Zhu <[email protected]> writes:

> On Tue, Aug 3, 2010 at 15:37, Eric W. Biederman <[email protected]> wrote:
>> Hui Zhu <[email protected]> writes:
>>
>
> Equal 0 is not a bug, the trouble is a lot of core's pid is same.
>
> This is what gdb say:
> /* Found an old thread with the same id. It has to be dead,
> otherwise we wouldn't be adding a new thread with the same id.
> The OS is reusing this id --- delete it, and recreate a new
> one. */

gdb bug compatibility is not a primary goal. Having an extensible
format and not inventing it totally out of the blue is the goal.

The goal was always that something could post process the output of
the kernel crashdump and create something that is gdb compatible. It
looks to me like it would take just a moment to strip out all of the
idle threads.

Claiming the pid is the cpu number when the pid is the idle pid gives
you no insulation against duplication, and it looses information.

Eric

2010-08-03 08:36:54

by Simon Horman

[permalink] [raw]
Subject: Re: [PATCH] kexec: set prstatus.pr_pid to cpu id when current->pid is 0

On Tue, Aug 03, 2010 at 01:15:04AM -0700, Eric W. Biederman wrote:
> Hui Zhu <[email protected]> writes:
>
> > On Tue, Aug 3, 2010 at 15:37, Eric W. Biederman <[email protected]> wrote:
> >> Hui Zhu <[email protected]> writes:
> >>
> >
> > Equal 0 is not a bug, the trouble is a lot of core's pid is same.
> >
> > This is what gdb say:
> > /* Found an old thread with the same id. It has to be dead,
> > otherwise we wouldn't be adding a new thread with the same id.
> > The OS is reusing this id --- delete it, and recreate a new
> > one. */
>
> gdb bug compatibility is not a primary goal. Having an extensible
> format and not inventing it totally out of the blue is the goal.
>
> The goal was always that something could post process the output of
> the kernel crashdump and create something that is gdb compatible. It
> looks to me like it would take just a moment to strip out all of the
> idle threads.
>
> Claiming the pid is the cpu number when the pid is the idle pid gives
> you no insulation against duplication, and it looses information.

Agreed, there clearly an ambiguity brought in by this patch as the range
of valid values for pids and cpus is essentially the same.

Doing this in user-space is the right place, though I'm not really
convinced its even correct there.