Subject: Re: [PATCH] KVM: x86: nVMX: allow RSM to restore VMXE CR4 flag
From: Liran Alon
Date: Tue, 26 Mar 2019 17:02:27 +0200
To: Vitaly Kuznetsov
Cc: kvm@vger.kernel.org, Paolo Bonzini, Radim Krčmář, Jon Doron, Sean Christopherson, linux-kernel@vger.kernel.org
Message-Id: <06E50BD4-B3AC-4DBB-B700-80C30F2DC8BB@oracle.com>
In-Reply-To: <87k1glagqj.fsf@vitty.brq.redhat.com>
References: <20190326130746.28748-1-vkuznets@redhat.com> <87k1glagqj.fsf@vitty.brq.redhat.com>

> On 26 Mar 2019, at 15:48, Vitaly Kuznetsov wrote:
>
> Liran Alon writes:
>
>>> On 26 Mar 2019, at 15:07, Vitaly Kuznetsov wrote:
>>> - Instead of putting the temporary HF_SMM_MASK drop to
>>> rsm_enter_protected_mode() (as was suggested by Liran), move it to
>>> emulator_set_cr() modifying its interface. emulate.c seems to be
>>> vcpu-specifics-free at this moment, we may want to keep it this way.
>>> - It seems that Hyper-V+UEFI on KVM is still broken, I'm observing sporadic
>>> hangs even with this patch. These hangs, however, seem to be unrelated to
>>> rsm.
>>
>> Feel free to share details on these hangs ;)
>>
>
> You've asked for it)
>
> The immediate issue I'm observing is some sort of a lockup which is easy
> to trigger with e.g. "-usb -device usb-tablet" on Qemu command line; it
> seems we get too many interrupts and combined with preemption timer for
> L2 we're not making any progress:
>
> kvm_userspace_exit: reason KVM_EXIT_IOAPIC_EOI (26)
> kvm_set_irq: gsi 18 level 1 source 0
> kvm_msi_set_irq: dst 0 vec 177 (Fixed|physical|level)
> kvm_apic_accept_irq: apicid 0 vec 177 (Fixed|edge)
> kvm_fpu: load
> kvm_entry: vcpu 0
> kvm_exit: reason VMRESUME rip 0xfffff80000848115 info 0 0
> kvm_entry: vcpu 0
> kvm_exit: reason PREEMPTION_TIMER rip 0xfffff800f4448e01 info 0 0
> kvm_nested_vmexit: rip fffff800f4448e01 reason PREEMPTION_TIMER info1 0 info2 0 int_info 0 int_info_err 0
> kvm_nested_vmexit_inject: reason EXTERNAL_INTERRUPT info1 0 info2 0 int_info 800000b1 int_info_err 0
> kvm_entry: vcpu 0
> kvm_exit: reason APIC_ACCESS rip 0xfffff8000081fe11 info 10b0 0
> kvm_apic: apic_write APIC_EOI = 0x0
> kvm_eoi: apicid 0 vector 177
> kvm_fpu: unload
> kvm_userspace_exit: reason KVM_EXIT_IOAPIC_EOI (26)
> ...
> (and the pattern repeats)
>
> Maybe it is a usb-only/Qemu-only problem, maybe not.
>
> --
> Vitaly

The trace of kvm_apic_accept_irq should indicate that __apic_accept_irq()
was called to inject an interrupt into the L1 guest.
(I know that we are running in L1 at this point because the next exit is a VMRESUME.)
However, it is surprising to see that on the next entry to the guest, no
interrupt was injected by vmx_inject_irq().
It may be because the L1 guest is currently running with interrupts disabled,
and therefore only an IRQ-window was requested.
(Too bad we don’t have a trace for this…)

Next, we got an exit from the L1 guest on VMRESUME. As part of its handling,
the active VMCS was changed from vmcs01 to vmcs02.
I believe the immediate exit later on preemption-timer was because the
immediate-exit-request mechanism was invoked, which is now implemented by
setting a VMX preemption-timer with a value of 0 (thanks to Sean).
(See vmx_vcpu_run() -> vmx_update_hv_timer() -> vmx_arm_hv_timer(vmx, 0)).
(Note that the pending interrupt was evaluated because of a recent patch of
mine to nested_vmx_enter_non_root_mode() to request KVM_REQ_EVENT when vmcs01
has requested an IRQ-window.)

Therefore when entering L2, you immediately get an exit on PREEMPTION_TIMER,
which will eventually cause L0 to call vmx_check_nested_events(), which now
notices the pending interrupt that should have been injected earlier to L1
and exits from L2 to L1 on EXTERNAL_INTERRUPT with vector 0xb1.

Then L1 handles the interrupt by performing an EOI to the LAPIC, which
propagates an EOI to the IOAPIC, which immediately re-injects the interrupt
(after clearing the remote_irr) as the irq-line is still set.
i.e. QEMU’s ioapic_eoi_broadcast() calls ioapic_service() immediately after
it clears remote-irr for this pin.

Also note that in the trace we see only a single kvm_set_irq to level 1, but
we don’t see another kvm_set_irq to level 0 immediately afterwards.
This should indicate that in QEMU’s IOAPIC redirection-table, this pin is
configured as a level-triggered interrupt.
However, the trace of kvm_apic_accept_irq indicates that this interrupt is
raised as an edge-triggered interrupt.

To sum up:
1) I would create a patch to add a trace to vcpu_enter_guest() when calling
enable_smi_window() / enable_nmi_window() / enable_irq_window().
2) It is worth investigating why MSI trigger-mode is edge-triggered instead
of level-triggered.
3) If this is indeed a level-triggered interrupt, it is worth investigating
how the interrupt source behaves. i.e. What causes this device to lower the
irq-line?
(As we don’t see any I/O Port or MMIO access by the L1 guest
interrupt-handler before performing the EOI)
4) Does this issue also reproduce when running with kernel-irqchip?
(Instead of split-irqchip)

-Liran
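
P.S. For anyone following the trace: the int_info 800000b1 value on the kvm_nested_vmexit_inject line can be decoded by hand to confirm that it is a valid external interrupt with vector 0xb1 (177), matching the vector seen in kvm_apic_accept_irq and kvm_eoi. Below is a minimal sketch of that decoding, assuming the VM-exit interruption-information layout from the Intel SDM (bits 7:0 vector, bits 10:8 type, bit 11 error-code valid, bit 31 valid). The helper name decode_intr_info and the dictionary keys are made up for illustration and do not correspond to any KVM function.

```python
# Decode a VMX VM-exit interruption-information value, such as the
# "int_info 800000b1" printed by the kvm_nested_vmexit_inject tracepoint.
# Assumed layout (Intel SDM): bits 7:0 vector, bits 10:8 interruption type,
# bit 11 error-code valid, bit 31 valid.
# decode_intr_info() is a hypothetical helper, not a KVM function.

INTR_TYPES = {
    0: "external interrupt",
    2: "NMI",
    3: "hardware exception",
    5: "privileged software exception",
    6: "software exception",
}


def decode_intr_info(value):
    """Split a 32-bit interruption-information value into its fields."""
    return {
        "valid": bool((value >> 31) & 1),
        "vector": value & 0xFF,
        "type": INTR_TYPES.get((value >> 8) & 0x7, "other/reserved"),
        "error_code_valid": bool((value >> 11) & 1),
    }


if __name__ == "__main__":
    info = decode_intr_info(0x800000B1)
    for field, val in info.items():
        print(f"{field}: {val}")
```

Running it on 0x800000b1 should report a valid external interrupt with vector 177 and no error code, which is consistent with the EXTERNAL_INTERRUPT exit reason shown in the trace.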