2021-06-07 18:00:31

by Sean Christopherson

[permalink] [raw]
Subject: [PATCH] KVM: x86: Ensure liveliness of nested VM-Enter fail tracepoint message

Use the __string() machinery provided by the tracing subystem to make a
copy of the string literals consumed by the "nested VM-Enter failed"
tracepoint. A complete copy is necessary to ensure that the tracepoint
can't outlive the data/memory it consumes and deference stale memory.

Because the tracepoint itself is defined by kvm, if kvm-intel and/or
kvm-amd are built as modules, the memory holding the string literals
defined by the vendor modules will be freed when the module is unloaded,
whereas the tracepoint and its data in the ring buffer will live until
kvm is unloaded (or "indefinitely" if kvm is built-in).

This bug has existed since the tracepoint was added, but was recently
exposed by a new check in tracing to detect exactly this type of bug.

fmt: '%s%s
' current_buffer: ' vmx_dirty_log_t-140127 [003] .... kvm_nested_vmenter_failed: '
WARNING: CPU: 3 PID: 140134 at kernel/trace/trace.c:3759 trace_check_vprintf+0x3be/0x3e0
CPU: 3 PID: 140134 Comm: less Not tainted 5.13.0-rc1-ce2e73ce600a-req #184
Hardware name: ASUS Q87M-E/Q87M-E, BIOS 1102 03/03/2014
RIP: 0010:trace_check_vprintf+0x3be/0x3e0
Code: <0f> 0b 44 8b 4c 24 1c e9 a9 fe ff ff c6 44 02 ff 00 49 8b 97 b0 20
RSP: 0018:ffffa895cc37bcb0 EFLAGS: 00010282
RAX: 0000000000000000 RBX: ffffa895cc37bd08 RCX: 0000000000000027
RDX: 0000000000000027 RSI: 00000000ffffdfff RDI: ffff9766cfad74f8
RBP: ffffffffc0a041d4 R08: ffff9766cfad74f0 R09: ffffa895cc37bad8
R10: 0000000000000001 R11: 0000000000000001 R12: ffffffffc0a041d4
R13: ffffffffc0f4dba8 R14: 0000000000000000 R15: ffff976409f2c000
FS: 00007f92fa200740(0000) GS:ffff9766cfac0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000559bd11b0000 CR3: 000000019fbaa002 CR4: 00000000001726e0
Call Trace:
trace_event_printf+0x5e/0x80
trace_raw_output_kvm_nested_vmenter_failed+0x3a/0x60 [kvm]
print_trace_line+0x1dd/0x4e0
s_show+0x45/0x150
seq_read_iter+0x2d5/0x4c0
seq_read+0x106/0x150
vfs_read+0x98/0x180
ksys_read+0x5f/0xe0
do_syscall_64+0x40/0xb0
entry_SYSCALL_64_after_hwframe+0x44/0xae

Cc: Steven Rostedt <[email protected]>
Fixes: 380e0055bc7e ("KVM: nVMX: trace nested VM-Enter failures detected by H/W")
Signed-off-by: Sean Christopherson <[email protected]>
---
arch/x86/kvm/trace.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h
index a61c015870e3..4f839148948b 100644
--- a/arch/x86/kvm/trace.h
+++ b/arch/x86/kvm/trace.h
@@ -1550,16 +1550,16 @@ TRACE_EVENT(kvm_nested_vmenter_failed,
TP_ARGS(msg, err),

TP_STRUCT__entry(
- __field(const char *, msg)
+ __string(msg, msg)
__field(u32, err)
),

TP_fast_assign(
- __entry->msg = msg;
+ __assign_str(msg, msg);
__entry->err = err;
),

- TP_printk("%s%s", __entry->msg, !__entry->err ? "" :
+ TP_printk("%s%s", __get_str(msg), !__entry->err ? "" :
__print_symbolic(__entry->err, VMX_VMENTER_INSTRUCTION_ERRORS))
);

--
2.32.0.rc1.229.g3e70b5a671-goog


2021-06-07 20:41:54

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH] KVM: x86: Ensure liveliness of nested VM-Enter fail tracepoint message

On Mon, 7 Jun 2021 10:57:48 -0700
Sean Christopherson <[email protected]> wrote:

> Use the __string() machinery provided by the tracing subystem to make a
> copy of the string literals consumed by the "nested VM-Enter failed"
> tracepoint. A complete copy is necessary to ensure that the tracepoint
> can't outlive the data/memory it consumes and deference stale memory.
>
> Because the tracepoint itself is defined by kvm, if kvm-intel and/or
> kvm-amd are built as modules, the memory holding the string literals
> defined by the vendor modules will be freed when the module is unloaded,
> whereas the tracepoint and its data in the ring buffer will live until
> kvm is unloaded (or "indefinitely" if kvm is built-in).
>
> This bug has existed since the tracepoint was added, but was recently
> exposed by a new check in tracing to detect exactly this type of bug.
>
> fmt: '%s%s
> ' current_buffer: ' vmx_dirty_log_t-140127 [003] .... kvm_nested_vmenter_failed: '
> WARNING: CPU: 3 PID: 140134 at kernel/trace/trace.c:3759 trace_check_vprintf+0x3be/0x3e0
> CPU: 3 PID: 140134 Comm: less Not tainted 5.13.0-rc1-ce2e73ce600a-req #184
> Hardware name: ASUS Q87M-E/Q87M-E, BIOS 1102 03/03/2014
> RIP: 0010:trace_check_vprintf+0x3be/0x3e0
> Code: <0f> 0b 44 8b 4c 24 1c e9 a9 fe ff ff c6 44 02 ff 00 49 8b 97 b0 20
> RSP: 0018:ffffa895cc37bcb0 EFLAGS: 00010282
> RAX: 0000000000000000 RBX: ffffa895cc37bd08 RCX: 0000000000000027
> RDX: 0000000000000027 RSI: 00000000ffffdfff RDI: ffff9766cfad74f8
> RBP: ffffffffc0a041d4 R08: ffff9766cfad74f0 R09: ffffa895cc37bad8
> R10: 0000000000000001 R11: 0000000000000001 R12: ffffffffc0a041d4
> R13: ffffffffc0f4dba8 R14: 0000000000000000 R15: ffff976409f2c000
> FS: 00007f92fa200740(0000) GS:ffff9766cfac0000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000559bd11b0000 CR3: 000000019fbaa002 CR4: 00000000001726e0
> Call Trace:
> trace_event_printf+0x5e/0x80
> trace_raw_output_kvm_nested_vmenter_failed+0x3a/0x60 [kvm]
> print_trace_line+0x1dd/0x4e0
> s_show+0x45/0x150
> seq_read_iter+0x2d5/0x4c0
> seq_read+0x106/0x150
> vfs_read+0x98/0x180
> ksys_read+0x5f/0xe0
> do_syscall_64+0x40/0xb0
> entry_SYSCALL_64_after_hwframe+0x44/0xae
>
> Cc: Steven Rostedt <[email protected]>
> Fixes: 380e0055bc7e ("KVM: nVMX: trace nested VM-Enter failures detected by H/W")
> Signed-off-by: Sean Christopherson <[email protected]>
> ---
>

Reviewed-by: Steven Rostedt (VMware) <[email protected]>

-- Steve

2021-06-09 08:52:04

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH] KVM: x86: Ensure liveliness of nested VM-Enter fail tracepoint message

On 07/06/21 20:48, Steven Rostedt wrote:
> On Mon, 7 Jun 2021 10:57:48 -0700
> Sean Christopherson <[email protected]> wrote:
>
>> Use the __string() machinery provided by the tracing subystem to make a
>> copy of the string literals consumed by the "nested VM-Enter failed"
>> tracepoint. A complete copy is necessary to ensure that the tracepoint
>> can't outlive the data/memory it consumes and deference stale memory.
>>
>> Because the tracepoint itself is defined by kvm, if kvm-intel and/or
>> kvm-amd are built as modules, the memory holding the string literals
>> defined by the vendor modules will be freed when the module is unloaded,
>> whereas the tracepoint and its data in the ring buffer will live until
>> kvm is unloaded (or "indefinitely" if kvm is built-in).
>>
>> This bug has existed since the tracepoint was added, but was recently
>> exposed by a new check in tracing to detect exactly this type of bug.
>>
>> fmt: '%s%s
>> ' current_buffer: ' vmx_dirty_log_t-140127 [003] .... kvm_nested_vmenter_failed: '
>> WARNING: CPU: 3 PID: 140134 at kernel/trace/trace.c:3759 trace_check_vprintf+0x3be/0x3e0
>> CPU: 3 PID: 140134 Comm: less Not tainted 5.13.0-rc1-ce2e73ce600a-req #184
>> Hardware name: ASUS Q87M-E/Q87M-E, BIOS 1102 03/03/2014
>> RIP: 0010:trace_check_vprintf+0x3be/0x3e0
>> Code: <0f> 0b 44 8b 4c 24 1c e9 a9 fe ff ff c6 44 02 ff 00 49 8b 97 b0 20
>> RSP: 0018:ffffa895cc37bcb0 EFLAGS: 00010282
>> RAX: 0000000000000000 RBX: ffffa895cc37bd08 RCX: 0000000000000027
>> RDX: 0000000000000027 RSI: 00000000ffffdfff RDI: ffff9766cfad74f8
>> RBP: ffffffffc0a041d4 R08: ffff9766cfad74f0 R09: ffffa895cc37bad8
>> R10: 0000000000000001 R11: 0000000000000001 R12: ffffffffc0a041d4
>> R13: ffffffffc0f4dba8 R14: 0000000000000000 R15: ffff976409f2c000
>> FS: 00007f92fa200740(0000) GS:ffff9766cfac0000(0000) knlGS:0000000000000000
>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 0000559bd11b0000 CR3: 000000019fbaa002 CR4: 00000000001726e0
>> Call Trace:
>> trace_event_printf+0x5e/0x80
>> trace_raw_output_kvm_nested_vmenter_failed+0x3a/0x60 [kvm]
>> print_trace_line+0x1dd/0x4e0
>> s_show+0x45/0x150
>> seq_read_iter+0x2d5/0x4c0
>> seq_read+0x106/0x150
>> vfs_read+0x98/0x180
>> ksys_read+0x5f/0xe0
>> do_syscall_64+0x40/0xb0
>> entry_SYSCALL_64_after_hwframe+0x44/0xae
>>
>> Cc: Steven Rostedt <[email protected]>
>> Fixes: 380e0055bc7e ("KVM: nVMX: trace nested VM-Enter failures detected by H/W")
>> Signed-off-by: Sean Christopherson <[email protected]>
>> ---
>>
>
> Reviewed-by: Steven Rostedt (VMware) <[email protected]>
>
> -- Steve
>

Queued, thanks.

Paolo

2021-06-09 16:38:36

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH] KVM: x86: Ensure liveliness of nested VM-Enter fail tracepoint message

On Wed, Jun 09, 2021, Wanpeng Li wrote:
> On Tue, 8 Jun 2021 at 02:00, Sean Christopherson <[email protected]> wrote:
> >
> > Use the __string() machinery provided by the tracing subystem to make a
> > copy of the string literals consumed by the "nested VM-Enter failed"
> > tracepoint. A complete copy is necessary to ensure that the tracepoint
> > can't outlive the data/memory it consumes and deference stale memory.
> >
> > Because the tracepoint itself is defined by kvm, if kvm-intel and/or
> > kvm-amd are built as modules, the memory holding the string literals
> > defined by the vendor modules will be freed when the module is unloaded,
> > whereas the tracepoint and its data in the ring buffer will live until
> > kvm is unloaded (or "indefinitely" if kvm is built-in).
> >
> > This bug has existed since the tracepoint was added, but was recently
> > exposed by a new check in tracing to detect exactly this type of bug.
> >
> > fmt: '%s%s
> > ' current_buffer: ' vmx_dirty_log_t-140127 [003] .... kvm_nested_vmenter_failed: '
> > WARNING: CPU: 3 PID: 140134 at kernel/trace/trace.c:3759 trace_check_vprintf+0x3be/0x3e0
> > CPU: 3 PID: 140134 Comm: less Not tainted 5.13.0-rc1-ce2e73ce600a-req #184
> > Hardware name: ASUS Q87M-E/Q87M-E, BIOS 1102 03/03/2014
> > RIP: 0010:trace_check_vprintf+0x3be/0x3e0
> > Code: <0f> 0b 44 8b 4c 24 1c e9 a9 fe ff ff c6 44 02 ff 00 49 8b 97 b0 20
> > RSP: 0018:ffffa895cc37bcb0 EFLAGS: 00010282
> > RAX: 0000000000000000 RBX: ffffa895cc37bd08 RCX: 0000000000000027
> > RDX: 0000000000000027 RSI: 00000000ffffdfff RDI: ffff9766cfad74f8
> > RBP: ffffffffc0a041d4 R08: ffff9766cfad74f0 R09: ffffa895cc37bad8
> > R10: 0000000000000001 R11: 0000000000000001 R12: ffffffffc0a041d4
> > R13: ffffffffc0f4dba8 R14: 0000000000000000 R15: ffff976409f2c000
> > FS: 00007f92fa200740(0000) GS:ffff9766cfac0000(0000) knlGS:0000000000000000
> > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 0000559bd11b0000 CR3: 000000019fbaa002 CR4: 00000000001726e0
> > Call Trace:
> > trace_event_printf+0x5e/0x80
> > trace_raw_output_kvm_nested_vmenter_failed+0x3a/0x60 [kvm]
> > print_trace_line+0x1dd/0x4e0
> > s_show+0x45/0x150
> > seq_read_iter+0x2d5/0x4c0
> > seq_read+0x106/0x150
> > vfs_read+0x98/0x180
> > ksys_read+0x5f/0xe0
> > do_syscall_64+0x40/0xb0
> > entry_SYSCALL_64_after_hwframe+0x44/0xae
>
> I can observe tons of other kvm tracepoints warning like this after
> commit 9a6944fee68e25 (tracing: Add a verifier to check string
> pointers for trace events), just echo 1 >
> /sys/kernel/tracing/events/kvm/enable and boot a linux guest.

Can you provide your .config? With all of events/kvm and events/kvmmmu enabled
I don't get any warnings running a Linux guest, a nested Linux guest, and
kvm-unit-tests.

Do you see the behavior with other tracepoints? E.g. enabling all events on my
systems yields warnings for a USB module, but everything else is clean.

2021-06-09 17:07:44

by Wanpeng Li

[permalink] [raw]
Subject: Re: [PATCH] KVM: x86: Ensure liveliness of nested VM-Enter fail tracepoint message

On Tue, 8 Jun 2021 at 02:00, Sean Christopherson <[email protected]> wrote:
>
> Use the __string() machinery provided by the tracing subystem to make a
> copy of the string literals consumed by the "nested VM-Enter failed"
> tracepoint. A complete copy is necessary to ensure that the tracepoint
> can't outlive the data/memory it consumes and deference stale memory.
>
> Because the tracepoint itself is defined by kvm, if kvm-intel and/or
> kvm-amd are built as modules, the memory holding the string literals
> defined by the vendor modules will be freed when the module is unloaded,
> whereas the tracepoint and its data in the ring buffer will live until
> kvm is unloaded (or "indefinitely" if kvm is built-in).
>
> This bug has existed since the tracepoint was added, but was recently
> exposed by a new check in tracing to detect exactly this type of bug.
>
> fmt: '%s%s
> ' current_buffer: ' vmx_dirty_log_t-140127 [003] .... kvm_nested_vmenter_failed: '
> WARNING: CPU: 3 PID: 140134 at kernel/trace/trace.c:3759 trace_check_vprintf+0x3be/0x3e0
> CPU: 3 PID: 140134 Comm: less Not tainted 5.13.0-rc1-ce2e73ce600a-req #184
> Hardware name: ASUS Q87M-E/Q87M-E, BIOS 1102 03/03/2014
> RIP: 0010:trace_check_vprintf+0x3be/0x3e0
> Code: <0f> 0b 44 8b 4c 24 1c e9 a9 fe ff ff c6 44 02 ff 00 49 8b 97 b0 20
> RSP: 0018:ffffa895cc37bcb0 EFLAGS: 00010282
> RAX: 0000000000000000 RBX: ffffa895cc37bd08 RCX: 0000000000000027
> RDX: 0000000000000027 RSI: 00000000ffffdfff RDI: ffff9766cfad74f8
> RBP: ffffffffc0a041d4 R08: ffff9766cfad74f0 R09: ffffa895cc37bad8
> R10: 0000000000000001 R11: 0000000000000001 R12: ffffffffc0a041d4
> R13: ffffffffc0f4dba8 R14: 0000000000000000 R15: ffff976409f2c000
> FS: 00007f92fa200740(0000) GS:ffff9766cfac0000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000559bd11b0000 CR3: 000000019fbaa002 CR4: 00000000001726e0
> Call Trace:
> trace_event_printf+0x5e/0x80
> trace_raw_output_kvm_nested_vmenter_failed+0x3a/0x60 [kvm]
> print_trace_line+0x1dd/0x4e0
> s_show+0x45/0x150
> seq_read_iter+0x2d5/0x4c0
> seq_read+0x106/0x150
> vfs_read+0x98/0x180
> ksys_read+0x5f/0xe0
> do_syscall_64+0x40/0xb0
> entry_SYSCALL_64_after_hwframe+0x44/0xae

I can observe tons of other kvm tracepoints warning like this after
commit 9a6944fee68e25 (tracing: Add a verifier to check string
pointers for trace events), just echo 1 >
/sys/kernel/tracing/events/kvm/enable and boot a linux guest.

Wanpeng

2021-06-09 18:10:17

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH] KVM: x86: Ensure liveliness of nested VM-Enter fail tracepoint message

On Wed, 9 Jun 2021 09:12:40 -0700
Sean Christopherson <[email protected]> wrote:

> > I can observe tons of other kvm tracepoints warning like this after
> > commit 9a6944fee68e25 (tracing: Add a verifier to check string
> > pointers for trace events), just echo 1 >
> > /sys/kernel/tracing/events/kvm/enable and boot a linux guest.
>
> Can you provide your .config? With all of events/kvm and events/kvmmmu enabled
> I don't get any warnings running a Linux guest, a nested Linux guest, and
> kvm-unit-tests.
>
> Do you see the behavior with other tracepoints? E.g. enabling all events on my
> systems yields warnings for a USB module, but everything else is clean.

Right, I tested this with running KVM guests when it was added. I was
actually surprised with Sean's report saying it was a kvm trace event,
until I noticed that the problem event is only triggered on error paths.

-- Steve

2021-06-10 12:13:37

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH] KVM: x86: Ensure liveliness of nested VM-Enter fail tracepoint message

On 07/06/21 19:57, Sean Christopherson wrote:
> Use the __string() machinery provided by the tracing subystem to make a
> copy of the string literals consumed by the "nested VM-Enter failed"
> tracepoint. A complete copy is necessary to ensure that the tracepoint
> can't outlive the data/memory it consumes and deference stale memory.
>
> Because the tracepoint itself is defined by kvm, if kvm-intel and/or
> kvm-amd are built as modules, the memory holding the string literals
> defined by the vendor modules will be freed when the module is unloaded,
> whereas the tracepoint and its data in the ring buffer will live until
> kvm is unloaded (or "indefinitely" if kvm is built-in).
>
> This bug has existed since the tracepoint was added, but was recently
> exposed by a new check in tracing to detect exactly this type of bug.
>
> fmt: '%s%s
> ' current_buffer: ' vmx_dirty_log_t-140127 [003] .... kvm_nested_vmenter_failed: '
> WARNING: CPU: 3 PID: 140134 at kernel/trace/trace.c:3759 trace_check_vprintf+0x3be/0x3e0
> CPU: 3 PID: 140134 Comm: less Not tainted 5.13.0-rc1-ce2e73ce600a-req #184
> Hardware name: ASUS Q87M-E/Q87M-E, BIOS 1102 03/03/2014
> RIP: 0010:trace_check_vprintf+0x3be/0x3e0
> Code: <0f> 0b 44 8b 4c 24 1c e9 a9 fe ff ff c6 44 02 ff 00 49 8b 97 b0 20
> RSP: 0018:ffffa895cc37bcb0 EFLAGS: 00010282
> RAX: 0000000000000000 RBX: ffffa895cc37bd08 RCX: 0000000000000027
> RDX: 0000000000000027 RSI: 00000000ffffdfff RDI: ffff9766cfad74f8
> RBP: ffffffffc0a041d4 R08: ffff9766cfad74f0 R09: ffffa895cc37bad8
> R10: 0000000000000001 R11: 0000000000000001 R12: ffffffffc0a041d4
> R13: ffffffffc0f4dba8 R14: 0000000000000000 R15: ffff976409f2c000
> FS: 00007f92fa200740(0000) GS:ffff9766cfac0000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000559bd11b0000 CR3: 000000019fbaa002 CR4: 00000000001726e0
> Call Trace:
> trace_event_printf+0x5e/0x80
> trace_raw_output_kvm_nested_vmenter_failed+0x3a/0x60 [kvm]
> print_trace_line+0x1dd/0x4e0
> s_show+0x45/0x150
> seq_read_iter+0x2d5/0x4c0
> seq_read+0x106/0x150
> vfs_read+0x98/0x180
> ksys_read+0x5f/0xe0
> do_syscall_64+0x40/0xb0
> entry_SYSCALL_64_after_hwframe+0x44/0xae
>
> Cc: Steven Rostedt <[email protected]>
> Fixes: 380e0055bc7e ("KVM: nVMX: trace nested VM-Enter failures detected by H/W")
> Signed-off-by: Sean Christopherson <[email protected]>
> ---
> arch/x86/kvm/trace.h | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h
> index a61c015870e3..4f839148948b 100644
> --- a/arch/x86/kvm/trace.h
> +++ b/arch/x86/kvm/trace.h
> @@ -1550,16 +1550,16 @@ TRACE_EVENT(kvm_nested_vmenter_failed,
> TP_ARGS(msg, err),
>
> TP_STRUCT__entry(
> - __field(const char *, msg)
> + __string(msg, msg)
> __field(u32, err)
> ),
>
> TP_fast_assign(
> - __entry->msg = msg;
> + __assign_str(msg, msg);
> __entry->err = err;
> ),
>
> - TP_printk("%s%s", __entry->msg, !__entry->err ? "" :
> + TP_printk("%s%s", __get_str(msg), !__entry->err ? "" :
> __print_symbolic(__entry->err, VMX_VMENTER_INSTRUCTION_ERRORS))
> );
>
>

Queued, thanks.

Paolo