Subject: Re: [PATCH 2/3] KVM: x86: guest debug: don't inject interrupts while single stepping
From: Maxim Levitsky
To: Jan Kiszka, Sean Christopherson
Cc: kvm@vger.kernel.org, Vitaly Kuznetsov, linux-kernel@vger.kernel.org,
    Thomas Gleixner, Wanpeng Li, Kieran Bingham, Jessica Yu, Andrew Morton,
    "maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)", Joerg Roedel,
    Jim Mattson, Borislav Petkov, Stefano Garzarella, "H. Peter Anvin",
    Paolo Bonzini, Ingo Molnar
Date: Tue, 16 Mar 2021 12:59:07 +0200
In-Reply-To: <1259724f-1bdb-6229-2772-3192f6d17a4a@siemens.com>
References: <20210315221020.661693-1-mlevitsk@redhat.com>
    <20210315221020.661693-3-mlevitsk@redhat.com>
    <1259724f-1bdb-6229-2772-3192f6d17a4a@siemens.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, 2021-03-16 at 10:16 +0100, Jan Kiszka wrote:
> On 16.03.21 00:37, Sean Christopherson wrote:
> > On Tue, Mar 16, 2021, Maxim Levitsky wrote:
> > > This change greatly helps with two issues:
> > >
> > > * Resuming from a breakpoint is much more reliable.
> > >
> > >   When resuming execution from a breakpoint, with interrupts enabled, more often
> > >   than not, KVM would inject an interrupt and make the CPU jump immediately to
> > >   the interrupt handler and eventually return to the breakpoint, to trigger it
> > >   again.
> > >
> > >   From the user point of view it looks like the CPU never executed a
> > >   single instruction and in some cases that can even prevent forward progress,
> > >   for example, when the breakpoint is placed by an automated script
> > >   (e.g. lx-symbols), which does something in response to the breakpoint and then
> > >   continues the guest automatically.
> > >   If the script execution takes enough time for another interrupt to arrive,
> > >   the guest will be stuck on the same breakpoint RIP forever.
> > >
> > > * Normal single stepping is much more predictable, since it won't land the
> > >   debugger into an interrupt handler, so it is much more usable.
> > >
> > >   (If entry to an interrupt handler is desired, the user can still place a
> > >   breakpoint at it and resume the guest, which won't activate this workaround
> > >   and let the gdb still stop at the interrupt handler)
> > >
> > > Since this change is only active when guest is debugged, it won't affect
> > > KVM running normal 'production' VMs.
> > >
> > >
> > > Signed-off-by: Maxim Levitsky
> > > Tested-by: Stefano Garzarella
> > > ---
> > >  arch/x86/kvm/x86.c | 6 ++++++
> > >  1 file changed, 6 insertions(+)
> > >
> > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > > index a9d95f90a0487..b75d990fcf12b 100644
> > > --- a/arch/x86/kvm/x86.c
> > > +++ b/arch/x86/kvm/x86.c
> > > @@ -8458,6 +8458,12 @@ static void inject_pending_event(struct kvm_vcpu *vcpu, bool *req_immediate_exit
> > >              can_inject = false;
> > >      }
> > >
> > > +    /*
> > > +     * Don't inject interrupts while single stepping to make guest debug easier
> > > +     */
> > > +    if (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP)
> > > +        return;
> > Is this something userspace can deal with? E.g. disable IRQs and/or set NMI
> > blocking at the start of single-stepping, unwind at the end? Deviating this far
> > from architectural behavior will end in tears at some point.
> >
>
> Does this happen to address this suspicious workaround in the kernel?
>
>     /*
>      * The kernel doesn't use TF single-step outside of:
>      *
>      *  - Kprobes, consumed through kprobe_debug_handler()
>      *  - KGDB, consumed through notify_debug()
>      *
>      * So if we get here with DR_STEP set, something is wonky.
>      *
>      * A known way to trigger this is through QEMU's GDB stub,
>      * which leaks #DB into the guest and causes IST recursion.
>      */
>     if (WARN_ON_ONCE(dr6 & DR_STEP))
>         regs->flags &= ~X86_EFLAGS_TF;
>
> (arch/x86/kernel/traps.c, exc_debug_kernel)
>
> I wonder why this got merged while no one fixed QEMU/KVM, for years? Oh,
> yeah, question to myself as well, dancing around broken guest debugging
> for a long time while trying to fix other issues...

To be honest, I haven't seen that warning even once, but I can imagine KVM
leaking #DB due to bugs in that code. That area historically hasn't received
much attention, since it can only be triggered by KVM_GET/SET_GUEST_DEBUG,
which isn't used in production.

The only issue I did see, on the other hand, is mostly gdb's fault: it fails
to remove a software breakpoint when resuming over it if that breakpoint's
Python handler messes with gdb's symbols, which is what lx-symbols does.
This happens even though lx-symbols doesn't touch the object (that is, the
kernel) where the breakpoint is defined. Just adding or removing one symbol
file is enough to trigger the issue.

Since lx-symbols already works around this when it reloads all symbols,
I extended that workaround to also apply when loading or unloading
a single symbol file.

Best regards,
    Maxim Levitsky

>
> Jan
>
> > > +
> > >      /*
> > >       * Finally, inject interrupt events. If an event cannot be injected
> > >       * due to architectural conditions (e.g. IF=0) a window-open exit
> > > --
> > > 2.26.2
> > >
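
For readers following the thread, the effect of the quoted hunk can be sketched as a small standalone C model. This is only an illustration under stated assumptions: `toy_vcpu` and `toy_inject_pending_event` are hypothetical names invented here, not KVM code, and the model ignores everything a real injection path has to consider (IF, NMI blocking, interrupt windows). Only the two `KVM_GUESTDBG_*` flag values are taken from the Linux UAPI header `<linux/kvm.h>`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag values as defined in the Linux UAPI header <linux/kvm.h>. */
#define KVM_GUESTDBG_ENABLE     0x00000001u
#define KVM_GUESTDBG_SINGLESTEP 0x00000002u

/* Hypothetical, heavily simplified stand-in for struct kvm_vcpu. */
struct toy_vcpu {
	uint32_t guest_debug;  /* KVM_GUESTDBG_* flags requested by the debugger */
	bool irq_pending;      /* an external interrupt is waiting for delivery */
	bool irq_injected;     /* the interrupt was handed to the guest */
};

/*
 * Hypothetical model of the quoted hunk: while the debugger is
 * single-stepping, leave the pending interrupt queued; otherwise
 * deliver it to the guest.
 */
static void toy_inject_pending_event(struct toy_vcpu *vcpu)
{
	if (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP)
		return;

	if (vcpu->irq_pending) {
		vcpu->irq_pending = false;
		vcpu->irq_injected = true;
	}
}
```

With `guest_debug` set to `KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_SINGLESTEP`, a pending interrupt stays queued across the call, so the stepped instruction runs before any handler does. With `guest_debug` cleared, the same call delivers the interrupt, which matches the claim in the patch description that VMs not being debugged are unaffected.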