Subject: Re: kvm splat in mmu_spte_clear_track_bits
From: Bernhard Held
To: Adam Borowski, Paolo Bonzini
Cc: Wanpeng Li, Radim Krčmář, kvm, "linux-kernel@vger.kernel.org"
Date: Mon, 28 Aug 2017 17:26:05 +0200
Message-ID: <0dcca3a4-8ecd-0d05-489c-7f6d1ddb49a6@gmx.de>
In-Reply-To: <20170827123505.u4kb24kigjqwa2t2@angband.pl>
References: <20170820231302.s732zclznrqxwr46@angband.pl>
 <20170821191203.jospdwqpnixlotx3@angband.pl>
 <20170821195833.GA696@flask>
 <20170821223228.edc6jrm7bpybtqlj@angband.pl>
 <1c270e76-05be-6f5f-29c6-9cb31f37f71d@redhat.com>
 <20170825131419.r5lzm6oluauu65nx@angband.pl>
 <0a85df4b-ca0a-7e70-51dc-90bd1c460c85@redhat.com>
 <20170827123505.u4kb24kigjqwa2t2@angband.pl>

On 08/27/2017 at 02:35 PM, Adam Borowski wrote:
> 4.13-rc5 retested fails
> Crashed only after two hours or so of testing.
>
> 4.13-rc4 apparently works
> It survived several hours of varied tests (like 5 debian-installer runs, a
> win10 point release upgrade, some hurd package building, openbsd, etc),
> all while the host was likewise busy.
>
> Thus: to the best of my knowledge, the problem is between 4.13-rc4 and 4.13-rc5
> but I wouldn't bet my life on it.

I get crashes with Win10 in kvm with 4.13-rc5. 4.13-rc4 works for me. THP
seems to accelerate the crash, but I'm not 100% sure of that.

There's still no crash after reverting merge 27df70 on 4.13-rc7.
There are 21 commits in this merge, 10 are mm-related:

$ git log 4e082e9ba7cd..e86b298bebf7 --pretty=oneline --abbrev-commit
e86b298bebf7 userfaultfd: replace ENOSPC with ESRCH in case mm has gone during copy/zeropage
f357e345eef7 zram: rework copy of compressor name in comp_algorithm_store()
aac2fea94f7a rmap: do not call mmu_notifier_invalidate_page() under ptl
d041353dc98a mm: fix list corruptions on shmem shrinklist
af54aed94bf3 mm/balloon_compaction.c: don't zero ballooned pages
c0a6a5ae6b5d MAINTAINERS: copy virtio on balloon_compaction.c
b3a81d0841a9 mm: fix KSM data corruption
99baac21e458 mm: fix MADV_[FREE|DONTNEED] TLB flush miss problem
0a2dd266dd6b mm: make tlb_flush_pending global
56236a59556c mm: refactor TLB gathering API
a9b802500ebb Revert "mm: numa: defer TLB flush for THP migration as long as possible"
0a2c40487f3e mm: migrate: fix barriers around tlb_flush_pending
16af97dc5a89 mm: migrate: prevent racy access to tlb_flush_pending
9eeb52ae712e fault-inject: fix wrong should_fail() decision in task context
4e98ebe5f435 test_kmod: fix small memory leak on filesystem tests
9c56771316ef test_kmod: fix the lock in register_test_dev_kmod()
434b06ae23ba test_kmod: fix bug which allows negative values on two config options
a4afe8cdec16 test_kmod: fix spelling mistake: "EMTPY" -> "EMPTY"
5af10dfd0afc userfaultfd: hugetlbfs: remove superfluous page unlock in VM_SHARED case
75dddef32514 mm: ratelimit PFNs busy info message
d507e2ebd2c7 mm: fix global NR_SLAB_.*CLAIMABLE counter reads

Any hint on what to test first is welcome!

Bernhard