2008-07-24 13:24:26

by Francis Moreau

Subject: KGDB fails to pass selft tests on x86-64 (v2.6.26)

Hello.
I wanted to give kgdb a try on a v2.6.26 kernel. My cpu is an x86-64.

So the first thing I did is to enable kgdb support and also the kgdb self tests:

CONFIG_HAVE_ARCH_KGDB=y
CONFIG_KGDB=y
CONFIG_KGDB_SERIAL_CONSOLE=y
CONFIG_KGDB_TESTS=y
# CONFIG_KGDB_TESTS_ON_BOOT is not set

Once compiled I booted this kernel through qemu and got the following:

...
kgdb: Registered I/O driver kgdbts.
kgdb: Waiting for connection from remote gdb...
kgdbts:RUN plant and detach test
kgdbts:RUN sw breakpoint test
kgdbts:RUN bad memory access test
kgdbts:RUN singlestep test 1000 iterations
kgdbts:RUN singlestep [0/1000]
kgdbts:RUN singlestep [100/1000]
kgdbts:RUN singlestep [200/1000]
kgdbts:RUN singlestep [300/1000]
kgdbts:RUN singlestep [400/1000]
kgdbts:RUN singlestep [500/1000]
kgdbts:RUN singlestep [600/1000]
kgdbts:RUN singlestep [700/1000]
kgdbts:RUN singlestep [800/1000]
kgdbts:RUN singlestep [900/1000]
kgdbts:RUN hw breakpoint test
kgdbts: BP mismatch ffffffff810677e3 expected ffffffff811b4e1f
------------[ cut here ]------------
WARNING: at drivers/misc/kgdbts.c:302 check_and_rewind_pc+0xbf/0xde()
Modules linked in:
Pid: 1, comm: swapper Tainted: G W 2.6.26 #9
Call Trace:
 <#DB> [<ffffffff8102f607>] warn_on_slowpath+0x58/0x94
 [<ffffffff810677e3>] ? kgdb_breakpoint+0x12/0x21
 [<ffffffff811b4e1f>] ? kgdbts_break_test+0x0/0x22
 [<ffffffff811b4ce8>] ? fill_get_buf+0xc9/0xd1
 [<ffffffff811b4e1f>] ? kgdbts_break_test+0x0/0x22
 [<ffffffff81073608>] ? probe_kernel_write+0x38/0x68
 [<ffffffff810677e3>] ? kgdb_breakpoint+0x12/0x21
 [<ffffffff811b4e1f>] ? kgdbts_break_test+0x0/0x22
 [<ffffffff811b554e>] check_and_rewind_pc+0xbf/0xde
 [<ffffffff811b4a69>] validate_simple_test+0x25/0x75
 [<ffffffff811b50df>] run_simple_test+0x1e1/0x255
 [<ffffffff811b4aee>] kgdbts_put_char+0x18/0x1a
 [<ffffffff81067616>] put_packet+0x79/0xd9
 [<ffffffff81068c39>] kgdb_handle_exception+0xdd3/0xf02
 [<ffffffff81018e48>] ? touch_nmi_watchdog+0x5e/0x62
 [<ffffffff811b4e1f>] ? kgdbts_break_test+0x0/0x22
 [<ffffffff8101c34a>] kgdb_notify+0x14c/0x166
 [<ffffffff812aa477>] notifier_call_chain+0x33/0x5b
 [<ffffffff812aa4c1>] atomic_notifier_call_chain+0x13/0x15
 [<ffffffff81045900>] notify_die+0x2e/0x30
 [<ffffffff812a8aa5>] do_int3+0x32/0x8f
 [<ffffffff812a82a3>] int3+0x93/0xb0
 [<ffffffff810677e3>] ? kgdb_breakpoint+0x12/0x21
 <<EOE>> [<ffffffff811b5267>] ? run_breakpoint_test+0x5b/0x9c
 [<ffffffff811b5a20>] ? configure_kgdbts+0x254/0x45d
 [<ffffffff81138399>] ? blk_register_region+0x2a/0x2c
 [<ffffffff8147ffb6>] ? init_kgdbts+0x0/0x16
 [<ffffffff8147ffca>] ? init_kgdbts+0x14/0x16
 [<ffffffff814606d3>] ? kernel_init+0x15f/0x2b3
 [<ffffffff8100ccf8>] ? child_rip+0xa/0x12
 [<ffffffff81460574>] ? kernel_init+0x0/0x2b3
 [<ffffffff8100ccee>] ? child_rip+0x0/0x12
---[ end trace 4eaa2a86a8e2da22 ]---
kgdbts: ERROR PUT: end of test buffer on 'hw_breakpoint_test' line 3 expected kgdbts_break_test got $d8fa4a81ffffffff0090b57f0081ffff000000000000000001000000000000001000000000000000286a5481ffffffff201e8a0f0081ffff201e8a0f0081ffffc01c8a0f0081ffff
------------[ cut here ]------------
WARNING: at drivers/misc/kgdbts.c:721 run_simple_test+0x22b/0x255()
Modules linked in:
Pid: 1, comm: swapper Tainted: G W 2.6.26 #9
Call Trace:
 <#DB> [<ffffffff8102f607>] warn_on_slowpath+0x58/0x94
 [<ffffffff811b4ce8>] ? fill_get_buf+0xc9/0xd1
 [<ffffffff81073608>] ? probe_kernel_write+0x38/0x68
 [<ffffffff810677e3>] ? kgdb_breakpoint+0x12/0x21
 [<ffffffff811b4e1f>] ? kgdbts_break_test+0x0/0x22
 [<ffffffff811b554e>] ? check_and_rewind_pc+0xbf/0xde
 [<ffffffff811b5129>] run_simple_test+0x22b/0x255
 [<ffffffff811b4aee>] kgdbts_put_char+0x18/0x1a
 [<ffffffff81067616>] put_packet+0x79/0xd9
 [<ffffffff81068c39>] kgdb_handle_exception+0xdd3/0xf02
 [<ffffffff81018e48>] ? touch_nmi_watchdog+0x5e/0x62
 [<ffffffff811b4e1f>] ? kgdbts_break_test+0x0/0x22
 [<ffffffff8101c34a>] kgdb_notify+0x14c/0x166
 [<ffffffff812aa477>] notifier_call_chain+0x33/0x5b
 [<ffffffff812aa4c1>] atomic_notifier_call_chain+0x13/0x15
 [<ffffffff81045900>] notify_die+0x2e/0x30
 [<ffffffff812a8aa5>] do_int3+0x32/0x8f
 [<ffffffff812a82a3>] int3+0x93/0xb0
 [<ffffffff810677e3>] ? kgdb_breakpoint+0x12/0x21
 <<EOE>> [<ffffffff811b5267>] ? run_breakpoint_test+0x5b/0x9c
 [<ffffffff811b5a20>] ? configure_kgdbts+0x254/0x45d
 [<ffffffff81138399>] ? blk_register_region+0x2a/0x2c
 [<ffffffff8147ffb6>] ? init_kgdbts+0x0/0x16
 [<ffffffff8147ffca>] ? init_kgdbts+0x14/0x16
 [<ffffffff814606d3>] ? kernel_init+0x15f/0x2b3
 [<ffffffff8100ccf8>] ? child_rip+0xa/0x12
 [<ffffffff81460574>] ? kernel_init+0x0/0x2b3
 [<ffffffff8100ccee>] ? child_rip+0x0/0x12
---[ end trace 4eaa2a86a8e2da22 ]---
kgdbts: ERROR hw_breakpoint_test test failed
------------[ cut here ]------------
WARNING: at drivers/misc/kgdbts.c:783 run_breakpoint_test+0x8a/0x9c()
Modules linked in:
Pid: 1, comm: swapper Tainted: G W 2.6.26 #9
Call Trace:
 [<ffffffff8102f607>] warn_on_slowpath+0x58/0x94
 [<ffffffff810677e3>] ? kgdb_breakpoint+0x12/0x21
 [<ffffffff811b5296>] run_breakpoint_test+0x8a/0x9c
 [<ffffffff811b5a20>] configure_kgdbts+0x254/0x45d
 [<ffffffff81138399>] ? blk_register_region+0x2a/0x2c
 [<ffffffff8147ffb6>] ? init_kgdbts+0x0/0x16
 [<ffffffff8147ffca>] init_kgdbts+0x14/0x16
 [<ffffffff814606d3>] kernel_init+0x15f/0x2b3
 [<ffffffff8100ccf8>] child_rip+0xa/0x12
 [<ffffffff81460574>] ? kernel_init+0x0/0x2b3
 [<ffffffff8100ccee>] ? child_rip+0x0/0x12
---[ end trace 4eaa2a86a8e2da22 ]---
kgdbts:RUN hw write breakpoint test
kgdbts: ERROR hw_write_break_test test failed
------------[ cut here ]------------
WARNING: at drivers/misc/kgdbts.c:815 run_hw_break_test+0xd4/0xe3()
Modules linked in:
Pid: 1, comm: swapper Tainted: G W 2.6.26 #9
Call Trace:
 [<ffffffff8102f607>] warn_on_slowpath+0x58/0x94
 [<ffffffff810677e3>] ? kgdb_breakpoint+0x12/0x21
 [<ffffffff811b537c>] run_hw_break_test+0xd4/0xe3
 [<ffffffff811b5a41>] configure_kgdbts+0x275/0x45d
 [<ffffffff81138399>] ? blk_register_region+0x2a/0x2c
 [<ffffffff8147ffb6>] ? init_kgdbts+0x0/0x16
 [<ffffffff8147ffca>] init_kgdbts+0x14/0x16
 [<ffffffff814606d3>] kernel_init+0x15f/0x2b3
 [<ffffffff8100ccf8>] child_rip+0xa/0x12
 [<ffffffff81460574>] ? kernel_init+0x0/0x2b3
 [<ffffffff8100ccee>] ? child_rip+0x0/0x12
---[ end trace 4eaa2a86a8e2da22 ]---
kgdbts:RUN access write breakpoint test
kgdbts: ERROR hw_access_break_test test failed
------------[ cut here ]------------
WARNING: at drivers/misc/kgdbts.c:815 run_hw_break_test+0xd4/0xe3()
Modules linked in:
Pid: 1, comm: swapper Tainted: G W 2.6.26 #9
Call Trace:
 [<ffffffff8102f607>] warn_on_slowpath+0x58/0x94
 [<ffffffff811b537c>] run_hw_break_test+0xd4/0xe3
 [<ffffffff811b5a5f>] configure_kgdbts+0x293/0x45d
 [<ffffffff81138399>] ? blk_register_region+0x2a/0x2c
 [<ffffffff8147ffb6>] ? init_kgdbts+0x0/0x16
 [<ffffffff8147ffca>] init_kgdbts+0x14/0x16
 [<ffffffff814606d3>] kernel_init+0x15f/0x2b3
 [<ffffffff8100ccf8>] child_rip+0xa/0x12
 [<ffffffff81460574>] ? kernel_init+0x0/0x2b3
 [<ffffffff8100ccee>] ? child_rip+0x0/0x12
---[ end trace 4eaa2a86a8e2da22 ]---
kgdb: Unregistered I/O driver kgdbts, debugger disabled.

I assumed that the tests failed, so I thought that the best thing to do was to post this on LKML.
So the question is "what's wrong?"

Thanks
--
Francis

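For readers who want to pick the failing tests out of a long boot log like the one above, here is a minimal sketch in Python. It assumes the console output has one message per line and relies only on the two line shapes visible in this report ("kgdbts:RUN ..." and "kgdbts: ERROR ... test failed"). The function name summarize_kgdbts and the sample text are illustrative, not part of kgdb.

```python
import re

# Line shapes taken from the log in this thread; the per-100 singlestep
# progress markers ("singlestep [0/1000]") are skipped on purpose.
RUN_RE = re.compile(r"^kgdbts:RUN (?!singlestep \[)(.+?)\s*$")
ERR_RE = re.compile(r"^kgdbts: ERROR (\w+) test failed\s*$")


def summarize_kgdbts(log_text):
    """Return (tests_started, tests_failed) parsed from kgdbts console lines."""
    started, failed = [], []
    for line in log_text.splitlines():
        line = line.strip()
        match = RUN_RE.match(line)
        if match:
            started.append(match.group(1))
            continue
        match = ERR_RE.match(line)
        if match:
            failed.append(match.group(1))
    return started, failed


# Hypothetical excerpt assembled from lines that appear in the report.
sample = """\
kgdbts:RUN sw breakpoint test
kgdbts:RUN singlestep test 1000 iterations
kgdbts:RUN singlestep [0/1000]
kgdbts:RUN hw breakpoint test
kgdbts: ERROR hw_breakpoint_test test failed
kgdbts:RUN hw write breakpoint test
kgdbts: ERROR hw_write_break_test test failed
"""

started, failed = summarize_kgdbts(sample)
print("started:", started)
print("failed: ", failed)
```

On the log in this report, such a summary would list only the three hardware breakpoint tests as failed, which matches the explanation given later in the thread.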

2008-07-24 14:08:21

by Francis Moreau

Subject: Re: KGDB fails to pass selft tests on x86-64 (v2.6.26)

On Thu, Jul 24, 2008 at 3:24 PM, Francis Moreau <[email protected]> wrote:
> I wanted to give kgdb a try on a v2.6.26 kernel. My cpu is an
> x86-64.
>
> So the first thing I did is to enable kgdb support and also the
> kgdb self tests:
>
> CONFIG_HAVE_ARCH_KGDB=y
> CONFIG_KGDB=y
> CONFIG_KGDB_SERIAL_CONSOLE=y
> CONFIG_KGDB_TESTS=y
> # CONFIG_KGDB_TESTS_ON_BOOT is not set
>
> Once compiled I booted this kernel through qemu and got the following:
>

and unfortunately it seems this issue is related to qemu: when I booted the
kernel without qemu, the self tests passed...


--
Francis

2008-07-24 14:30:39

by Jason Wessel

Subject: Re: KGDB fails to pass selft tests on x86-64 (v2.6.26)

Francis Moreau wrote:
> On Thu, Jul 24, 2008 at 3:24 PM, Francis Moreau <[email protected]> wrote:
>
>> I wanted to give kgdb a try on a v2.6.26 kernel. My cpu is an
>> x86-64.
>>
>> So the first thing I did is to enable kgdb support and also the
>> kgdb self tests:
>>
>> CONFIG_HAVE_ARCH_KGDB=y
>> CONFIG_KGDB=y
>> CONFIG_KGDB_SERIAL_CONSOLE=y
>> CONFIG_KGDB_TESTS=y
>> # CONFIG_KGDB_TESTS_ON_BOOT is not set
>>
>> Once compiled I booted this kernel through qemu and got the following:
>>
>>
>
> and unfortunately it seems this issue is related to qemu: when I booted the
> kernel without qemu, the self tests passed...
>
>
>

It is because the qemu you are using does not support hardware
breakpoints. The tests will warn, but the kernel will continue to boot
and tell you that hw breakpoint support doesn't work on your simulated
hardware.

Also, if you plan to use software breakpoints, please make sure to turn
off CONFIG_DEBUG_RODATA, else they will not work. This is a regression
which is on my list of things to take a look at, as time permits.

Cheers,
Jason.

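The configuration advice in this message can be checked mechanically. Below is a minimal sketch in Python, assuming the usual .config line shapes ("CONFIG_X=y" and "# CONFIG_X is not set"). The helper names parse_kconfig and kgdb_config_problems are hypothetical, and treating CONFIG_KGDB and CONFIG_KGDB_SERIAL_CONSOLE as the required options is an assumption based only on the options listed in this thread. The CONFIG_DEBUG_RODATA warning reflects the regression described above for this kernel era.

```python
NOT_SET_SUFFIX = " is not set"


def parse_kconfig(text):
    """Map option name -> value; '# CONFIG_X is not set' maps to 'n'."""
    opts = {}
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# CONFIG_") and line.endswith(NOT_SET_SUFFIX):
            # Drop the leading "# " and the trailing " is not set".
            opts[line[2:-len(NOT_SET_SUFFIX)]] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            opts[name] = value
    return opts


def kgdb_config_problems(opts):
    """Return human-readable warnings (assumed rules, see the lead-in)."""
    problems = []
    for name in ("CONFIG_KGDB", "CONFIG_KGDB_SERIAL_CONSOLE"):
        if opts.get(name, "n") != "y":
            problems.append(name + " is not enabled")
    if opts.get("CONFIG_DEBUG_RODATA", "n") == "y":
        problems.append(
            "CONFIG_DEBUG_RODATA=y: software breakpoints may not work")
    return problems


# Options from the original report, plus a hypothetical DEBUG_RODATA=y line.
sample_config = """\
CONFIG_HAVE_ARCH_KGDB=y
CONFIG_KGDB=y
CONFIG_KGDB_SERIAL_CONSOLE=y
CONFIG_KGDB_TESTS=y
# CONFIG_KGDB_TESTS_ON_BOOT is not set
CONFIG_DEBUG_RODATA=y
"""

opts = parse_kconfig(sample_config)
problems = kgdb_config_problems(opts)
print(problems)
```
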
2008-07-24 14:38:41

by Vegard Nossum

Subject: Re: KGDB fails to pass selft tests on x86-64 (v2.6.26)

On Thu, Jul 24, 2008 at 4:30 PM, Jason Wessel
<[email protected]> wrote:
> Francis Moreau wrote:
>> On Thu, Jul 24, 2008 at 3:24 PM, Francis Moreau <[email protected]> wrote:
>>> So the first thing I did is to enable kgdb support and also the
>>> kgdb self tests:
>>>
>>> CONFIG_HAVE_ARCH_KGDB=y
>>> CONFIG_KGDB=y
>>> CONFIG_KGDB_SERIAL_CONSOLE=y
>>> CONFIG_KGDB_TESTS=y
>>> # CONFIG_KGDB_TESTS_ON_BOOT is not set
>>>
>>> Once compiled I booted this kernel through qemu and got the following:
>>>
>> and unfortunately it seems this issue is related to qemu: when I booted the
>> kernel without qemu, the self tests passed...
>>
>>
>>
>
> It is because the qemu you are using does not support hardware
> breakpoints. The tests will warn, but the kernel will continue to boot
> and tell you that hw breakpoint support doesn't work on your simulated
> hardware.
>
> Also, if you plan to use software breakpoints, please make sure to turn
> off CONFIG_DEBUG_RODATA, else they will not work. This is a regression
> which is on my list of things to take a look at, as time permits.

I had a couple of kernels hang during single-stepping self-tests (I
think), where it reached 500 and 900 tests respectively before it hung
hard (NMI watchdog enabled, but not triggering). Is this related to
the RODATA thing?

This was with a recent (post-v2.6.26) kernel on a real P4.


Vegard

--
"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
-- E. W. Dijkstra, EWD1036

2008-07-24 14:56:00

by Jason Wessel

Subject: Re: KGDB fails to pass selft tests on x86-64 (v2.6.26)

Vegard Nossum wrote:
> On Thu, Jul 24, 2008 at 4:30 PM, Jason Wessel
> <[email protected]> wrote:
>
>> Also, if you plan to use software breakpoints, please make sure to turn
>> off CONFIG_DEBUG_RODATA, else they will not work. This is a regression
>> which is on my list of things to take a look at, as time permits.
>>
>
> I had a couple of kernels hang during single-stepping self-tests (I
> think), where it reached 500 and 900 tests respectively before it hung
> hard (NMI watchdog enabled, but not triggering). Is this related to
> the RODATA thing?
>
> This was with a recent (post-v2.6.26) kernel on a real P4.
>
>
>

It is not likely that CONFIG_DEBUG_RODATA can have any impact because
the boot test occurs before the text sections are marked read-only. A
hang in this section indicates a conflict where something is spinning
for a lock in the NMI handler, or the NMI handler re-entered and tried
to acquire another lock.

In the 2.6.26 time frame I found and fixed at least one defect around
updating the clock while in the NMI which required that you not do it
from the NMI context. It sounds as if there is yet another problem
along these lines, and of course it is a timing race...

Jason.

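The kind of hang described in this message (a handler spinning on a lock that the interrupted code already holds) can be illustrated with a user-space analogy. This is only a sketch in Python, not kernel code: threading.Lock stands in for a non-reentrant kernel spinlock, and the names clock_lock, handler_try_update_clock and interrupted_while_holding_lock are hypothetical.

```python
import threading

# Hypothetical stand-in for a non-reentrant lock protecting clock updates.
clock_lock = threading.Lock()


def handler_try_update_clock():
    """Model of a handler that needs the lock.

    A blocking acquire here would never return if the interrupted code
    already holds the lock, so this sketch uses a non-blocking attempt
    and reports whether the lock was available.
    """
    got_lock = clock_lock.acquire(blocking=False)
    if got_lock:
        clock_lock.release()
    return got_lock


def interrupted_while_holding_lock():
    """Model of the handler firing while the same context holds the lock."""
    with clock_lock:
        return handler_try_update_clock()


print("handler with lock free:", handler_try_update_clock())
print("handler with lock held:", interrupted_while_holding_lock())
```

In the second call the lock is unavailable; a real handler that spun waiting for it would hang exactly as described, and since the outcome depends on when the handler fires, the failure shows up as a timing race.
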
2008-07-24 15:20:03

by Francis Moreau

Subject: Re: KGDB fails to pass selft tests on x86-64 (v2.6.26)

Hello,

On Thu, Jul 24, 2008 at 4:30 PM, Jason Wessel
<[email protected]> wrote:
>
> It is because the qemu you are using does not support hardware
> breakpoints. The tests will warn, but the kernel will continue to boot
> and tell you that hw breakpoint support doesn't work on your simulated
> hardware.

ah ok, thanks.

While I have your attention, I sometimes (well, pretty often actually) get this
inside my gdb session:

(gdb) n
warning: Invalid remote reply: 00

or

(gdb) n
warning: Invalid remote reply: 00
warning: Invalid remote reply:
c0dcb40f0081ffff08e59f0f0081ffff00b1a40f0081ffff8086400f0081ffff704e2c81ffffffff01000000008000000d00000000000000000000000000000000000001000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000002000000000000000ffffffffffffffff0000000000000000

Do you have any idea what's wrong in this case?

>
> Also, if you plan to use software breakpoints, please make sure to turn
> off CONFIG_DEBUG_RODATA, else they will not work. This is a regression
> which is on my list of things to take a look at, as time permits.
>

Yes, I did that, since I noticed a previous post where you advised the same
thing.

Thanks
--
Francis

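For context on the "Invalid remote reply" warnings above: in the GDB remote serial protocol a packet is framed as "$", the payload, "#", and a two-digit hex checksum (the payload bytes summed modulo 256). A bare "00" has no such framing. The sketch below, in Python, checks that framing and splits a hex dump into 8-byte little-endian words. The function names are hypothetical, payload escaping is not handled, and reading the long hex strings in this thread as register dumps is an assumption.

```python
def rsp_checksum(payload):
    """Two-digit hex checksum: sum of payload bytes modulo 256."""
    return format(sum(payload.encode("ascii")) % 256, "02x")


def check_packet(packet):
    """True if packet is '$<payload>#<xx>' with a matching checksum.

    Simplified: escaped bytes inside the payload are not handled.
    """
    if len(packet) < 4 or not packet.startswith("$") or packet[-3] != "#":
        return False
    payload, given = packet[1:-3], packet[-2:]
    return rsp_checksum(payload) == given.lower()


def decode_words(hex_dump, size=8):
    """Split a hex dump into size-byte little-endian integers."""
    data = bytes.fromhex(hex_dump)
    return [int.from_bytes(data[i:i + size], "little")
            for i in range(0, len(data) - size + 1, size)]


print(check_packet("$OK#9a"))   # well-formed packet
print(check_packet("00"))       # the reply gdb complained about
# First 16 bytes of the dump printed by the failing hw breakpoint test.
for word in decode_words("d8fa4a81ffffffff0090b57f0081ffff"):
    print(hex(word))
```

Decoded this way, the words look like x86-64 kernel addresses, which is consistent with (but does not prove) the dump being a register reply that arrived where gdb expected a different packet.
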
2008-07-24 19:05:28

by Vegard Nossum

Subject: Re: KGDB fails to pass selft tests on x86-64 (v2.6.26)

On Thu, Jul 24, 2008 at 4:55 PM, Jason Wessel
<[email protected]> wrote:
> It is not likely that CONFIG_DEBUG_RODATA can have any impact because
> the boot test occurs before the text sections are marked read-only. A
> hang in this section indicates a conflict where something is spinning
> for a lock in the NMI handler, or the NMI handler re-entered and tried
> to acquire another lock.
>
> In the 2.6.26 time frame I found and fixed at least one defect around
> updating the clock while in the NMI which required that you not do it
> from the NMI context. It sounds as if there is yet another problem
> along these lines, and of course it is a timing race...

I just got another one now, with HEAD at
f0766440dda7ace8a43b030f75e2dcb82449fb85:

calling init_kgdbts+0x0/0x20
kgdb: Registered I/O driver kgdbts.
kgdbts:RUN plant and detach test
kgdbts:RUN sw breakpoint test
kgdbts:RUN bad memory access test
kgdbts:RUN singlestep test 1000 iterations
kgdbts:RUN singlestep [0/1000]
kgdbts:RUN singlestep [100/1000]
kgdbts:RUN singlestep [200/1000]
kgdbts:RUN singlestep [300/1000]
kgdbts:RUN singlestep [400/1000]
kgdbts:RUN singlestep [500/1000]
kgdbts:RUN singlestep [600/1000]

Full stop. But the commit has timestamp May 9, 2008, so maybe you
fixed it after that.


Vegard

--
"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
-- E. W. Dijkstra, EWD1036
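
When comparing several hung boots like the one above, it can help to extract the last singlestep progress marker from each log. A minimal sketch in Python follows, assuming the "kgdbts:RUN singlestep [n/total]" line shape shown in this thread. The function name last_singlestep_progress is hypothetical.

```python
import re

# Progress marker shape as printed in the logs in this thread.
PROGRESS_RE = re.compile(r"kgdbts:RUN singlestep \[(\d+)/(\d+)\]")


def last_singlestep_progress(log_text):
    """Return (done, total) from the last progress marker, or None."""
    last = None
    for match in PROGRESS_RE.finditer(log_text):
        last = (int(match.group(1)), int(match.group(2)))
    return last


# Tail of the log from the message above.
hung_log = """\
kgdbts:RUN singlestep test 1000 iterations
kgdbts:RUN singlestep [0/1000]
kgdbts:RUN singlestep [100/1000]
kgdbts:RUN singlestep [200/1000]
kgdbts:RUN singlestep [300/1000]
kgdbts:RUN singlestep [400/1000]
kgdbts:RUN singlestep [500/1000]
kgdbts:RUN singlestep [600/1000]
"""

progress = last_singlestep_progress(hung_log)
print(progress)
```

Because a marker appears only every 100 iterations in these logs, the result narrows the hang to the window after the last printed marker and before the next one, not to an exact iteration.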