Date: Wed, 15 Mar 2017 14:35:42 +0100
From: Christoffer Dall
To: Marc Zyngier
Cc: Suzuki K Poulose, linux-arm-kernel@lists.infradead.org,
	andreyknvl@google.com, dvyukov@google.com,
	christoffer.dall@linaro.org, kvmarm@lists.cs.columbia.edu,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org, kcc@google.com,
	syzkaller@googlegroups.com, will.deacon@arm.com,
	catalin.marinas@arm.com, pbonzini@redhat.com, mark.rutland@arm.com,
	ard.biesheuvel@linaro.org, stable@vger.kernel.org
Subject: Re: [PATCH 3/3] kvm: arm/arm64: Fix locking for kvm_free_stage2_pgd
Message-ID: <20170315133542.GP1277@cbox>
In-Reply-To: <0e5ff7f7-855c-ea28-fdee-73c062c3d289@arm.com>

On Wed, Mar 15, 2017 at 01:28:07PM +0000, Marc Zyngier wrote:
> On 15/03/17 10:56, Christoffer Dall wrote:
> > On Wed, Mar 15, 2017 at 09:39:26AM +0000, Marc Zyngier wrote:
> >> On 15/03/17 09:21, Christoffer Dall wrote:
> >>> On Tue, Mar 14, 2017 at 02:52:34PM +0000, Suzuki K Poulose wrote:
> >>>> In kvm_free_stage2_pgd() we don't hold the kvm->mmu_lock while calling
> >>>> unmap_stage2_range() on the entire memory range for the guest. This could
> >>>> cause problems with other callers (e.g, munmap on a memslot) trying to
> >>>> unmap a range.
> >>>>
> >>>> Fixes: commit d5d8184d35c9 ("KVM: ARM: Memory virtualization setup")
> >>>> Cc: stable@vger.kernel.org # v3.10+
> >>>> Cc: Marc Zyngier
> >>>> Cc: Christoffer Dall
> >>>> Signed-off-by: Suzuki K Poulose
> >>>> ---
> >>>>  arch/arm/kvm/mmu.c | 3 +++
> >>>>  1 file changed, 3 insertions(+)
> >>>>
> >>>> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
> >>>> index 13b9c1f..b361f71 100644
> >>>> --- a/arch/arm/kvm/mmu.c
> >>>> +++ b/arch/arm/kvm/mmu.c
> >>>> @@ -831,7 +831,10 @@ void kvm_free_stage2_pgd(struct kvm *kvm)
> >>>>  	if (kvm->arch.pgd == NULL)
> >>>>  		return;
> >>>>
> >>>> +	spin_lock(&kvm->mmu_lock);
> >>>>  	unmap_stage2_range(kvm, 0, KVM_PHYS_SIZE);
> >>>> +	spin_unlock(&kvm->mmu_lock);
> >>>> +
> >>>
> >>> This ends up holding the spin lock for potentially quite a while, where
> >>> we can do things like __flush_dcache_area(), which I think can fault.
> >>
> >> I believe we're always using the linear mapping (or kmap on 32bit) in
> >> order not to fault.
> >>
> >
> > ok, then there's just the concern that we may be holding a spinlock for
> > a very long time. I seem to recall Mario once added something where he
> > unlocked and gave a chance to schedule something else for each PUD or
> > something like that, because he ran into the issue during migration. Am
> > I confusing this with something else?
>
> That definitely rings a bell: stage2_wp_range() uses that kind of trick
> to give the system a chance to breathe. Maybe we could use a similar
> trick in our S2 unmapping code? How about this (completely untested) patch:
>
> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
> index 962616fd4ddd..1786c24212d4 100644
> --- a/arch/arm/kvm/mmu.c
> +++ b/arch/arm/kvm/mmu.c
> @@ -292,8 +292,13 @@ static void unmap_stage2_range(struct kvm *kvm, phys_addr_t start, u64 size)
>  	phys_addr_t addr = start, end = start + size;
>  	phys_addr_t next;
>
> +	BUG_ON(!spin_is_locked(&kvm->mmu_lock));
> +
>  	pgd = kvm->arch.pgd + stage2_pgd_index(addr);
>  	do {
> +		if (need_resched() || spin_needbreak(&kvm->mmu_lock))
> +			cond_resched_lock(&kvm->mmu_lock);
> +
>  		next = stage2_pgd_addr_end(addr, end);
>  		if (!stage2_pgd_none(*pgd))
>  			unmap_stage2_puds(kvm, pgd, addr, next);
>
> The additional BUG_ON() is just for my own peace of mind - we seem to
> have missed a couple of these lately, and the "breathing" code makes
> it imperative that this lock is being taken prior to entering the
> function.
>

Looks good to me!

-Christoffer
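
For readers following the thread, the "breathing" idea in the quoted patch
(assert that the caller holds the lock, then periodically drop and retake it
inside a long loop) can be modelled in user space. This is only a rough
sketch, not kernel code: struct sketch_lock, sketch_lock_break() and
sketch_unmap_range() are hypothetical names invented here, the lock is a
simple C11 atomic flag, and a fixed "break every N chunks" counter stands in
for the need_resched() || spin_needbreak() check, which has no direct
user-space equivalent.

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for a kernel spinlock. */
struct sketch_lock {
	atomic_bool held;
	unsigned int breaks;	/* times the holder dropped and retook the lock */
};

static void sketch_lock_init(struct sketch_lock *l)
{
	atomic_init(&l->held, false);
	l->breaks = 0;
}

static void sketch_lock_acquire(struct sketch_lock *l)
{
	/* Spin until this caller flips 'held' from false to true. */
	while (atomic_exchange(&l->held, true))
		sched_yield();
}

static void sketch_lock_release(struct sketch_lock *l)
{
	atomic_store(&l->held, false);
}

/* Loose analogue of cond_resched_lock(): drop, let others run, retake. */
static void sketch_lock_break(struct sketch_lock *l)
{
	sketch_lock_release(l);
	sched_yield();
	sketch_lock_acquire(l);
	l->breaks++;	/* updated only while the lock is held again */
}

/*
 * Process nr_chunks units of work with the lock held, breaking the lock
 * every break_every chunks (0 disables breaking). The caller must already
 * hold the lock. Returns the number of chunks processed.
 */
static size_t sketch_unmap_range(struct sketch_lock *l, size_t nr_chunks,
				 size_t break_every)
{
	size_t done = 0;

	/* Plays the role of the BUG_ON() in the quoted patch. */
	assert(atomic_load(&l->held));

	for (size_t i = 0; i < nr_chunks; i++) {
		if (i != 0 && break_every != 0 && i % break_every == 0)
			sketch_lock_break(l);

		done++;	/* stand-in for unmapping one top-level entry */
	}

	return done;
}
```

A caller would initialise the lock, acquire it, call sketch_unmap_range(),
and release it afterwards. The point the sketch tries to show is the one
made in the thread: once the loop may drop the lock on its own, the function
only makes sense if the caller took the lock first, which is why the entry
check matters.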