From: Wanpeng Li
Date: Fri, 19 Jul 2019 16:59:25 +0800
Subject: Re: [5.2 regression] x86/fpu changes cause crashes in KVM guest
To: Thomas Lambertz
Cc: Sebastian Andrzej Siewior, Rik van Riel, Dave Hansen, Thomas Gleixner, Ingo Molnar, Borislav Petkov, "H. Peter Anvin", "the arch/x86 maintainers", LKML, Paolo Bonzini, Radim Krcmar, kvm, Peter Zijlstra
X-Mailing-List: linux-kernel@vger.kernel.org

Cc kvm ml,

On Thu, 18 Jul 2019 at 08:08, Thomas Lambertz wrote:
>
> Since kernel 5.2, I've been experiencing strange issues in my Windows 10
> QEMU/KVM guest.
> Via bisection, I have tracked down that the issue lies in the FPU state
> handling changes.
> Kernels before 8ff468c29e9a9c3afe9152c10c7b141343270bf3 work great, the
> ones afterwards are affected.
> Sometimes the state seems to be restored incorrectly in the guest.
>
> I have managed to reproduce it relatively cleanly, on a Linux guest.
> (ubuntu-server 18.04, but that should not matter, since it occurred on
> Windows as well)
>
> To reproduce the issue, you need prime95 (or mprime), from
> https://www.mersenne.org/download/ .
> This is just a stress test for the FPU, which helps reproduce the error
> much quicker.
>
> - Run it in the guest as 'Benchmark Only', and choose the '(2) Small
> FFTs' torture test. Give it the maximum amount of cores (for me 10).
> - On the host, run the same test. To keep my pc usable, I limited it to
> 5 cores. I do this to put some pressure on the system.
> - repeatedly focus and unfocus the qemu window
>
> With this config, errors in the guest usually occur within 30 seconds.
> Without the refocusing, it takes ~5min on average, but the variance of
> this time is quite large.
>
> The error messages are either
> "FATAL ERROR: Rounding was ......., expected less than 0.4"
> or
> "FATAL ERROR: Resulting sum was ....., expected: ......",
> suggesting that something in the calculation has gone wrong.
>
> On the host, no errors are ever observed!

I found that it is introduced by commit 5f409e20b (x86/fpu: Defer FPU
state load until return to userspace) and can only be reproduced when
CONFIG_PREEMPT is enabled. Why does the commit restore the qemu
userspace FPU context to hardware before vmentry?
https://lkml.org/lkml/2017/11/14/945

Actually, I suspect that commit f775b13eedee2 (x86,kvm: move qemu/guest
FPU switching out to vcpu_run) incorrectly saves the guest FPU state,
which is in the xsave area, into the qemu userspace FPU buffer. However,
Rik replied in https://lkml.org/lkml/2017/11/14/891, "The scheduler will
save the guest fpu context when a vCPU thread is preempted, and restore
it when it is scheduled back in." But I can't find any scheduler code
that does this. In addition, the code below fixes the mprime errors.
(Still not sure it is correct)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 58305cf..18f928e 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3306,6 +3306,9 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 
 	kvm_x86_ops->vcpu_load(vcpu, cpu);
 
+	if (test_thread_flag(TIF_NEED_FPU_LOAD))
+		switch_fpu_return();
+
 	/* Apply any externally detected TSC adjustments (due to suspend) */
 	if (unlikely(vcpu->arch.tsc_offset_adjustment)) {
 		adjust_tsc_offset_host(vcpu, vcpu->arch.tsc_offset_adjustment);
@@ -7990,10 +7993,6 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 	trace_kvm_entry(vcpu->vcpu_id);
 	guest_enter_irqoff();
 
-	fpregs_assert_state_consistent();
-	if (test_thread_flag(TIF_NEED_FPU_LOAD))
-		switch_fpu_return();
-
 	if (unlikely(vcpu->arch.switch_db_regs)) {
 		set_debugreg(0, 7);
 		set_debugreg(vcpu->arch.eff_db[0], 0);
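
For readers less familiar with the deferred-load scheme named in commit
5f409e20b's title, here is a toy userspace sketch of the general idea as
I understand it: a per-task flag (standing in for TIF_NEED_FPU_LOAD)
marks that the CPU registers no longer hold the task's FPU state, and
the reload is postponed until something like switch_fpu_return() runs.
Everything prefixed with toy_ is hypothetical and invented for
illustration; it is not kernel code, and it does not model KVM's
guest/qemu FPU switching, so it neither reproduces nor explains the bug.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a task; not a kernel type. */
struct toy_task {
	int saved_fpu;      /* FPU state saved in memory for this task */
	bool need_fpu_load; /* models TIF_NEED_FPU_LOAD */
};

/* Models the CPU's FPU registers as a single integer. */
static int toy_hw_fpu;

/*
 * Models scheduling a task out: the registers are saved only if they
 * still belong to this task, i.e. the flag is clear.
 */
static void toy_switch_out(struct toy_task *t)
{
	if (!t->need_fpu_load)
		t->saved_fpu = toy_hw_fpu;
}

/*
 * Models scheduling a task back in: the registers may now hold someone
 * else's state, so the task is only flagged and the load is deferred.
 */
static void toy_switch_in(struct toy_task *t)
{
	t->need_fpu_load = true;
}

/* Models switch_fpu_return(): perform the deferred load if flagged. */
static void toy_switch_fpu_return(struct toy_task *t)
{
	if (t->need_fpu_load) {
		toy_hw_fpu = t->saved_fpu;
		t->need_fpu_load = false;
	}
}
```

In this model, between toy_switch_in() and toy_switch_fpu_return() the
registers hold whatever the previous owner left there. The patch above
only changes where the corresponding flag check sits relative to
vcpu_load and guest entry.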