2022-02-01 08:47:12

by Vasyl Vavrychuk

Subject: gdb switches to __sysvec_apic_timer_interrupt or __default_send_IPI_dest_field with KVM enabled

Hello,

I run the Linux kernel under qemu-system-x86_64 via the "-kernel" option.

I also added the "-s" option to accept a gdb connection.

After Linux boots, I connect with gdb and set a breakpoint in some
function, for example "device_del"; which one does not really matter.

The problem is that if I also use "--enable-kvm", then after the breakpoint
triggers and I send "n" from gdb, it switches to

__sysvec_apic_timer_interrupt (regs=0xffffc90000297de8) at
arch/x86/kernel/apic/apic.c:1102
1102 trace_local_timer_entry(LOCAL_TIMER_VECTOR);

or to

__default_send_IPI_dest_field (mask=<optimized out>,
vector=<optimized out>, dest=dest@entry=2048) at
arch/x86/kernel/apic/ipi.c:161
161 cfg = __prepare_ICR2(mask);

I am stepping over kernel code that does not perform any waiting or blocking.

Everything works fine with "--enable-kvm" removed.
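
For reference, the setup described above can be sketched roughly as follows.
This is only an illustration: the file names "bzImage" and "vmlinux" are
placeholders that do not come from this report, and the two helper functions
are hypothetical. "-s" is QEMU's shorthand for "-gdb tcp::1234", which is why
gdb connects to port 1234.

```python
import shlex


def qemu_argv(kernel_image, use_kvm):
    """Build the QEMU command line from the report.

    kernel_image is a placeholder path; the report does not name the image.
    """
    argv = ["qemu-system-x86_64", "-kernel", kernel_image, "-s"]
    if use_kvm:
        # The option whose presence triggers the stepping problem.
        argv.append("--enable-kvm")
    return argv


def gdb_argv(vmlinux, breakpoint):
    """Build a gdb command line that attaches to QEMU and sets one breakpoint.

    vmlinux is a placeholder for the kernel image with debug symbols.
    """
    return [
        "gdb", vmlinux,
        "-ex", "target remote :1234",   # port opened by QEMU's "-s"
        "-ex", f"break {breakpoint}",
    ]


if __name__ == "__main__":
    print(shlex.join(qemu_argv("bzImage", use_kvm=True)))
    print(shlex.join(gdb_argv("vmlinux", "device_del")))
```

Calling qemu_argv() with use_kvm=False gives the configuration in which
stepping works as expected.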

Thanks,
Vasyl


2022-02-01 16:24:52

by Maxim Levitsky

Subject: Re: gdb switches to __sysvec_apic_timer_interrupt or __default_send_IPI_dest_field with KVM enabled

On Sat, 2022-01-29 at 23:06 +0200, Vasyl Vavrychuk wrote:
> Hello,
>
> I run the Linux kernel under qemu-system-x86_64 via the "-kernel" option.
>
> I also added the "-s" option to accept a gdb connection.
>
> After Linux boots, I connect with gdb and set a breakpoint in some
> function, for example "device_del"; which one does not really matter.
>
> The problem is that if I also use "--enable-kvm", then after the breakpoint
> triggers and I send "n" from gdb, it switches to
>
> __sysvec_apic_timer_interrupt (regs=0xffffc90000297de8) at
> arch/x86/kernel/apic/apic.c:1102
> 1102 trace_local_timer_entry(LOCAL_TIMER_VECTOR);
>
> or to
>
> __default_send_IPI_dest_field (mask=<optimized out>,
> vector=<optimized out>, dest=dest@entry=2048) at
> arch/x86/kernel/apic/ipi.c:161
> 161 cfg = __prepare_ICR2(mask);
>
> I am stepping over kernel code that does not perform any waiting or blocking.
>
> Everything works fine with "--enable-kvm" removed.

I recently fixed that, and the code is upstream AFAIK, but the qemu
side of it has probably not made it into a release yet.

The problem you are seeing is that every time you single step, an interrupt
occurs, because you are not as fast as the computer is: the timer interrupt
fires about 1000 times a second, so after each single step you do, one will
be pending.

That makes GDB land you in the interrupt handler, which is correct
technically but makes single stepping pretty much impossible.
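
As a back-of-the-envelope illustration of the timing argument, here is a small
sketch. The 1000 Hz figure is the rough rate mentioned above; the 0.5 second
pause between two "n" commands is just an assumed value for a human at the
keyboard, and the function names are made up for this example.

```python
def ticks_between_steps(timer_hz, seconds_between_steps):
    """Timer periods that elapse while the guest sits stopped between two steps."""
    return int(timer_hz * seconds_between_steps)


def timer_irq_pending(timer_hz, seconds_between_steps):
    """True if at least one timer period has passed, i.e. a timer interrupt is
    waiting to be delivered as soon as the guest resumes."""
    return ticks_between_steps(timer_hz, seconds_between_steps) >= 1


if __name__ == "__main__":
    # A human pausing 0.5 s between steps versus a ~1 ms timer period.
    print(ticks_between_steps(1000, 0.5))
    print(timer_irq_pending(1000, 0.5))
    # Only a pause shorter than one timer period would avoid a pending interrupt.
    print(timer_irq_pending(1000, 0.0001))
```

With any realistic typing speed the pause spans hundreds of timer periods, so
an interrupt is always waiting when the step resumes.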

The solution is to tell the kernel to mask interrupts while single stepping,
regardless of whether they are masked by the guest. QEMU already does this
when TCG is used, but it was not implemented for KVM.
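
For illustration, QEMU's gdbstub documentation describes a single-step mask
that can be queried and set from gdb with "maintenance packet qqemu.sstepbits"
and "maintenance packet Qqemu.sstep=HEX_VALUE". The sketch below only shows
how such a mask could be composed. The bit names and the example reply string
are taken from that documentation, not from this thread; parsing the values as
hex is an assumption, and the helper functions are hypothetical. Whether the
NOIRQ bit has any effect under KVM depends on running QEMU and kernel versions
that carry the fix mentioned above.

```python
def parse_sstepbits(reply):
    """Parse a reply such as 'ENABLE=1,NOIRQ=2,NOTIMER=4' into {name: mask}.

    Values are assumed to be hexadecimal.
    """
    bits = {}
    for item in reply.split(","):
        name, value = item.split("=")
        bits[name.strip()] = int(value, 16)
    return bits


def sstep_packet(bits, names):
    """Compose the packet text that sets the given single-step flags."""
    mask = 0
    for name in names:
        mask |= bits[name]
    return f"Qqemu.sstep=0x{mask:x}"


if __name__ == "__main__":
    # Example reply as shown in QEMU's documentation (an assumption here).
    bits = parse_sstepbits("ENABLE=1,NOIRQ=2,NOTIMER=4")
    # Single step with both interrupt delivery and timers suppressed.
    print(sstep_packet(bits, ["ENABLE", "NOIRQ", "NOTIMER"]))
```

The resulting string would be sent from gdb as
"maintenance packet Qqemu.sstep=0x7".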

Best regards,
Maxim Levitsky

PS: you might also want to patch the kernel's lx-symbols gdb script to fix loadable module support,
which currently doesn't work well. I ran out of time to upstream it; I'll get to it
someday.

The problem here is that the kernel's gdb script uses a breakpoint in the function that
loads modules, and when it hits, it reloads gdb symbols. That is frowned upon in the gdb docs,
but it is pretty much the only way to do it.

I patched the lx-symbols script to at least work with recent gdb, but this no doubt relies on at least some undefined
behavior in gdb, therefore I didn't push this further.

https://patchwork.kernel.org/project/kvm/patch/[email protected]/





2022-02-08 15:08:22

by Vasyl Vavrychuk

Subject: Re: gdb switches to __sysvec_apic_timer_interrupt or __default_send_IPI_dest_field with KVM enabled

Thanks a lot for these fixes, which I can use, and for the detailed explanation.

On Mon, Jan 31, 2022 at 12:42 PM Maxim Levitsky <[email protected]> wrote:
> I recently fixed that, and the code is upstream AFAIK, but the qemu
> side of it has probably not made it into a release yet.

You are right. I have observed an unrelated gdb issue when debugging
the kernel under QEMU and prepared a packaging backport:
https://salsa.debian.org/gdb-team/gdb/-/merge_requests/9

> I patched the lx-symbols script to at least work with recent gdb, but this no doubt relies on at least some undefined
> behavior in gdb, therefore I didn't push this further.
>
> https://patchwork.kernel.org/project/kvm/patch/[email protected]/

What a coincidence, I use lx-symbols with an external kernel module. I
have noticed that it sometimes behaves strangely, but I eventually found
an order of commands that works for me.
