Message-ID: <5A16F4BF.9060306@ORACLE.COM>
Date: Thu, 23 Nov 2017 18:18:07 +0200
From: Liran Alon
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.5.0
MIME-Version: 1.0
To:
 Radim Krčmář, Marc Haber
CC: LKML, "KVM-ML (kvm@vger.kernel.org)", Wanpeng Li
Subject: Re: VMs freezing when host is running 4.14
References: <20171121161821.b6k3hdl3wgia5f5q@torres.zugschlus.de>
 <20171122093945.5afa2di2g7qhf4eb@torres.zugschlus.de>
 <20171122155208.wdcmosxfpsjbwcrm@torres.zugschlus.de>
 <20171122164312.GA21279@flask>
 <20171123152024.7xsc7lesv2qyujng@torres.zugschlus.de>
 <20171123155946.GC21184@flask>
In-Reply-To: <20171123155946.GC21184@flask>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
X-Source-IP: userv0022.oracle.com [156.151.31.74]
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 23/11/17 17:59, Radim Krčmář wrote:
> 2017-11-23 16:20+0100, Marc Haber:
>> On Wed, Nov 22, 2017 at 05:43:13PM +0100, Radim Krčmář wrote:
>>> 2017-11-22 16:52+0100, Marc Haber:
>>>> On Wed, Nov 22, 2017 at 04:04:42PM +0100, 王金浦 wrote:
>>>>> So all guest kernels are 4.14, or also other older kernels?
>>>>
>>>> Guest kernels are also 4.14, but the issue disappears when the host is
>>>> downgraded to an older kernel. I therefore reckoned that the guest
>>>> kernel doesn't matter, but that was before I saw the trace in the log.
>>>
>>> The two most suspicious patches since 4.13 (which I assume works) are
>>>
>>> 664f8e26b00c ("KVM: X86: Fix loss of exception which has not yet been
>>> injected")
>>
>> That one does not revert cleanly, the line in question seems to have
>> been removed a bit later.
>>
>> Reject is:
>> 141 [24/5001]mh@fan:~/linux/git/linux ((v4.14.1) %) $ cat arch/x86/kvm/vmx.c.rej
>> --- arch/x86/kvm/vmx.c
>> +++ arch/x86/kvm/vmx.c
>> @@ -2516,7 +2516,7 @@ static void vmx_queue_exception(struct kvm_vcpu *vcpu)
>>   struct vcpu_vmx *vmx = to_vmx(vcpu);
>>   unsigned nr = vcpu->arch.exception.nr;
>>   bool has_error_code = vcpu->arch.exception.has_error_code;
>> - bool reinject = vcpu->arch.exception.injected;
>> + bool reinject = vcpu->arch.exception.reinject;
>>   u32 error_code = vcpu->arch.exception.error_code;
>>   u32 intr_info = nr | INTR_INFO_VALID_MASK;
>
> This line can be deleted as reinject isn't used in the function.
>
> Btw. there have already been many fixes from Liran Alon for that patch
> and your case could be the one addressed in
> https://urldefense.proofpoint.com/v2/url?u=https-3A__www.spinics.net_lists_kvm_msg159158.html&d=DwIDaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=Jk6Q8nNzkQ6LJ6g42qARkg6ryIDGQr-yKXPNGZbpTx0&m=206jU1rQdk3xs1DYWbQPz1gR7Iim02XOjwn458rwgIo&s=fz1JeZiSQBwqYpkmeX8OJukyC4M8BeXSuIOKwuVaeHg&e=
>
> The patch is incorrect, but you might be able to see only its benefits.

Actually I would first attempt to check this patch of mine:
https://www.spinics.net/lists/kvm/msg159062.html
It fixes a bug of an L2 exception accidentally being delivered into L1.

Regards,
-Liran

>
>>> and
>>>
>>> 9a6e7c39810e ("KVM: async_pf: Fix #DF due to inject "Page not Present"
>>> and "Page Ready" exceptions simultaneously")
>>>
>>> please try reverting them to see if it helps,
>>
>> That one reverted cleanly. I am now running the new kernel on the
>> affected machine, and I think that a second machine has joined the
>> market of being affected.
>
> That one had much lower chances of being the culprit.
>
>> Would this matter on the host only or on the guests as well?
>
> Only on the host.
>
> Thanks.
>