Date: Tue, 29 Aug 2017 14:34:05 -0400
From: Jerome Glisse <jglisse@redhat.com>
To: Andrea Arcangeli
Cc: Adam Borowski, Takashi Iwai, Bernhard Held, Nadav Amit,
	Paolo Bonzini, Wanpeng Li, Radim Krčmář, Joerg Roedel,
	"Kirill A. Shutemov", Andrew Morton, Linus Torvalds, kvm,
	linux-kernel@vger.kernel.org, Michal Hocko
Subject: Re: kvm splat in mmu_spte_clear_track_bits
Message-ID: <20170829183405.GB7546@redhat.com>
References: <20170825131419.r5lzm6oluauu65nx@angband.pl>
	<0a85df4b-ca0a-7e70-51dc-90bd1c460c85@redhat.com>
	<20170827123505.u4kb24kigjqwa2t2@angband.pl>
	<0dcca3a4-8ecd-0d05-489c-7f6d1ddb49a6@gmx.de>
	<79BC5306-4ED4-41E4-B2C1-12197D9D1709@gmail.com>
	<20170829125923.g3tp22bzsrcuruks@angband.pl>
	<20170829140924.GB21615@redhat.com>
In-Reply-To: <20170829140924.GB21615@redhat.com>

On Tue, Aug 29, 2017 at 04:09:24PM +0200, Andrea Arcangeli wrote:
> Hello,
>
> On Tue, Aug 29, 2017 at 02:59:23PM +0200, Adam Borowski wrote:
> > On Tue, Aug 29, 2017 at 02:45:41PM +0200, Takashi Iwai wrote:
> > > [Put more people to Cc, sorry for growing too much...]
> >
> > We're all interested in 4.13.0 not crashing on us, so that's ok.
> >
> > > On Tue, 29 Aug 2017 11:19:13 +0200,
> > > Bernhard Held wrote:
> > > >
> > > > On 08/28/2017 at 06:56 PM, Nadav Amit wrote:
> > > > > Don’t blame me for the TLB stuff... My money is on aac2fea94f7a.
> > > >
> > > > Amit, thanks for your courage to expose your patch!
> > > >
> > > > I'm more and more confident that aac2fea94f7a is the culprit.
> > > > Maybe it just accelerates the triggering of the splat. To be more
> > > > sure, the kernel needs to be tested for a couple of days. It
> > > > would be great if others could assist in testing aac2fea94f7a.
> > >
> > > I've been testing with the revert for a while and it seems to be
> > > working.
> >
> > With nothing but aac2fea94f7a reverted, no explosions for me either.
>
> The aforementioned commit has 3 bugs.
>
> 1) mmu_notifier_invalidate_range cannot be used in replacement of
>    mmu_notifier_invalidate_range_start/end. For KVM,
>    mmu_notifier_invalidate_range is a noop and rightfully so. An MMU
>    notifier implementation has to implement either the
>    ->invalidate_range method or the invalidate_range_start/end
>    methods, not both. And if you implement invalidate_range_start/end,
>    like KVM is forced to do, calling mmu_notifier_invalidate_range in
>    common code is a noop for KVM.
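
As a side note for readers not familiar with the API, here is a minimal
sketch of the two implementation styles Andrea describes, using the
4.13-era struct mmu_notifier_ops signatures. The ops tables and callback
names below are invented for the illustration; they are not the actual
KVM or AMD iommuv2 code.

/* Illustration only, not real kernel code. */
#include <linux/mmu_notifier.h>

static void example_range_start(struct mmu_notifier *mn,
				struct mm_struct *mm,
				unsigned long start, unsigned long end)
{
	/* block secondary-MMU faults, tear down mappings in [start, end) */
}

static void example_range_end(struct mmu_notifier *mn,
			      struct mm_struct *mm,
			      unsigned long start, unsigned long end)
{
	/* allow secondary-MMU faults to re-establish mappings */
}

static void example_invalidate_range(struct mmu_notifier *mn,
				     struct mm_struct *mm,
				     unsigned long start, unsigned long end)
{
	/* flush the secondary TLB for [start, end) in one shot */
}

/*
 * KVM-style: the secondary MMU has its own page tables, so it must
 * bracket invalidations with start/end to keep its page-fault path
 * from re-establishing stale entries in between.  ->invalidate_range
 * is left NULL, so a bare mmu_notifier_invalidate_range() call in
 * common code never reaches this notifier -- it is a noop for it.
 */
static const struct mmu_notifier_ops kvm_style_ops = {
	.invalidate_range_start	= example_range_start,
	.invalidate_range_end	= example_range_end,
};

/*
 * AMD-iommuv2-style: the secondary MMU shares the primary MMU's page
 * table, so a single ->invalidate_range call is enough.  It is also
 * invoked implicitly from mmu_notifier_invalidate_range_end().
 */
static const struct mmu_notifier_ops iommu_style_ops = {
	.invalidate_range	= example_invalidate_range,
};
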
> For those MMU notifiers that can get away with only implementing
> ->invalidate_range, the ->invalidate_range is implicitly called by
> mmu_notifier_invalidate_range_end(). And only those secondary MMUs
> that share the same pagetable with the primary MMU (like AMD
> iommuv2) can get away with only implementing ->invalidate_range.
>
> So all cases (THP on/off) are broken right now.
>
> To fix this it is enough to replace mmu_notifier_invalidate_range
> with mmu_notifier_invalidate_range_start;mmu_notifier_invalidate_range_end.
> Either that or call mmu_notifier_invalidate_page multiple times as
> before.

Kirill did regress invalidate_page: it used to be called outside the
spinlock and is now called inside it, so simply reverting would
reintroduce that regression. You can refer to the thread about it:
https://lkml.org/lkml/2017/8/9/418

Jérôme
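
P.S.: To make the first fix Andrea suggests concrete, here is a rough
sketch of bracketing a pte clear with the start/end pair (4.13-era
mmu_notifier API; the function and its locking details are invented for
the illustration, this is not the actual mm/rmap.c patch):

#include <linux/mm.h>
#include <linux/mmu_notifier.h>

/*
 * Illustration only.  The start/end callbacks are allowed to sleep,
 * so they run outside the page table spinlock; only the pte clear
 * and TLB flush happen under it.
 */
static void example_unmap_one(struct vm_area_struct *vma,
			      unsigned long address, pte_t *ptep,
			      spinlock_t *ptl)
{
	struct mm_struct *mm = vma->vm_mm;
	unsigned long start = address;
	unsigned long end = address + PAGE_SIZE;

	mmu_notifier_invalidate_range_start(mm, start, end);

	spin_lock(ptl);
	ptep_clear_flush(vma, address, ptep);	/* clear pte + flush TLB */
	spin_unlock(ptl);

	/*
	 * ->invalidate_range (AMD-iommuv2 style) is called implicitly
	 * from here, so both notifier styles see the invalidation.
	 */
	mmu_notifier_invalidate_range_end(mm, start, end);
}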