Subject: Re: kvm splat in mmu_spte_clear_track_bits
To: Adam Borowski, Radim Krčmář
Cc: Wanpeng Li, kvm, "linux-kernel@vger.kernel.org"
References: <20170820231302.s732zclznrqxwr46@angband.pl> <20170821191203.jospdwqpnixlotx3@angband.pl> <20170821195833.GA696@flask> <20170821223228.edc6jrm7bpybtqlj@angband.pl>
From: Paolo Bonzini
Message-ID: <1c270e76-05be-6f5f-29c6-9cb31f37f71d@redhat.com>
Date: Wed, 23 Aug 2017 14:22:46 +0200
In-Reply-To: <20170821223228.edc6jrm7bpybtqlj@angband.pl>

On 22/08/2017 00:32, Adam Borowski wrote:
> On Mon, Aug 21, 2017 at 09:58:34PM +0200, Radim Krčmář wrote:
>> 2017-08-21 21:12+0200, Adam Borowski:
>>> On Mon, Aug 21, 2017 at 09:26:57AM +0800, Wanpeng Li wrote:
>>>> 2017-08-21 7:13 GMT+08:00 Adam Borowski:
>>>>> I'm afraid I keep getting a quite reliable, but random, splat when running
>>>>> KVM:
>>>>
>>>> I reported something similar before. https://lkml.org/lkml/2017/6/29/64
>>>
>>> Your problem seems to require OOM; I don't have any memory pressure at all:
>>> running a single 2GB guest while there's nothing big on the host (bloatfox,
>>> xfce, xorg, terminals + some minor junk); 8GB + (untouched) swap. There's
>>> no memory pressure inside the guest either -- none was Linux (I wanted to
>>> test something on hurd, kfreebsd) and I doubt they even got to use all of
>>> their frames.
>>
>> I even tried hurd, but couldn't reproduce ...
>
> Also happens with a win10 guest, and with multiple Linuxes.
>
>> what is your qemu command
>> line and the output of host's `grep . /sys/module/kvm*/parameters/*`?
>
> qemu-system-x86_64 -enable-kvm -m 2048 -vga qxl -usbdevice tablet \
> -net bridge -net nic \
> -drive file="$DISK",cache=writeback,index=0,media=disk,discard=on
>
> qemu-system-x86_64 -enable-kvm -m 2048 -vga qxl -usbdevice tablet \
> -net bridge -net nic \
> -drive file="$DISK",cache=unsafe,index=0,media=disk,discard=on,if=virtio,format=raw
>
> /sys/module/kvm/parameters/halt_poll_ns:200000
> /sys/module/kvm/parameters/halt_poll_ns_grow:2
> /sys/module/kvm/parameters/halt_poll_ns_shrink:0
> /sys/module/kvm/parameters/ignore_msrs:N
> /sys/module/kvm/parameters/kvmclock_periodic_sync:Y
> /sys/module/kvm/parameters/lapic_timer_advance_ns:0
> /sys/module/kvm/parameters/min_timer_period_us:500
> /sys/module/kvm/parameters/tsc_tolerance_ppm:250
> /sys/module/kvm/parameters/vector_hashing:Y
> /sys/module/kvm_amd/parameters/avic:0
> /sys/module/kvm_amd/parameters/nested:1
> /sys/module/kvm_amd/parameters/npt:1
> /sys/module/kvm_amd/parameters/vls:0
>
>>> Also, it doesn't reproduce for me on 4.12.
>>
>> Great info ... the most suspicious between v4.12 and v4.13-rc5 is the
>> series with dcdca5fed5f6 ("x86: kvm: mmu: make spte mmio mask more
>> explicit"), does reverting it help?
>>
>> `git revert ce00053b1cfca312c22e2a6465451f1862561eab~1..995f00a619584e65e53eff372d9b73b121a7bad5`
>
> Alas, doesn't seem to help.
>
> I've first installed a Debian stretch guest, the host survived both the
> installation and subsequent fooling around. But then I started a win10
> guest which splatted as soon as the initial screen.

Can you check if disabling THP on the host also fixes it for you?

I would also try commit 1372324b328cd5dabaef5e345e37ad48c63df2a9 to
identify whether it was caused by a KVM change in 4.13 or something else.

Paolo
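
For anyone following along, the THP check on the host could be done roughly as below. This is only a minimal sketch, assuming the usual sysfs control file /sys/kernel/mm/transparent_hugepage/enabled; the helper names (thp_active_mode, thp_show) are made up for illustration and are not part of any existing tool.

```shell
#!/bin/sh
# Sketch: inspect the host's transparent hugepage (THP) setting.
# THP_ENABLED is the standard sysfs control file; the function names
# below are hypothetical helpers for this example only.

THP_ENABLED=/sys/kernel/mm/transparent_hugepage/enabled

# Print the active mode from a line such as "always madvise [never]".
# The kernel marks the current mode with square brackets.
thp_active_mode() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([a-z]*\)\].*/\1/p'
}

# Print the current host setting, or fail if the file is not readable.
thp_show() {
    if [ -r "$THP_ENABLED" ]; then
        thp_active_mode "$(cat "$THP_ENABLED")"
    else
        echo "THP control file not found: $THP_ENABLED" >&2
        return 1
    fi
}

# To disable THP until the next reboot (run as root, before starting
# the guest):
#   echo never > /sys/kernel/mm/transparent_hugepage/enabled
```

Calling thp_show before and after the write should confirm that the mode actually changed. The setting does not persist across reboots, so it has to be reapplied for each test run.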