From: Dmitry Vyukov
Date: Thu, 23 Mar 2017 17:39:19 +0100
Subject: Re: kvm: WARNING in mmu_spte_clear_track_bits
To: Radim Krčmář
Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, "H. Peter Anvin", "x86@kernel.org", KVM list, LKML, Alan Stern, Steve Rutherford, Xiao Guangrong, Haozhong Zhang, syzkaller
In-Reply-To: <20170314151720.GA4036@potion>
References: <3e72461c-7197-e941-1d35-1aca34df2f8e@redhat.com> <20170314151720.GA4036@potion>

On Tue, Mar 14, 2017 at 4:17 PM, Radim Krčmář wrote:
> 2017-03-12 12:20+0100, Dmitry Vyukov:
>> On Tue, Jan 17, 2017 at 5:00 PM, Dmitry Vyukov wrote:
>>> On Tue, Jan 17, 2017 at 4:20 PM, Paolo Bonzini wrote:
>>>>
>>>>
>>>> On 13/01/2017 12:15, Dmitry Vyukov wrote:
>>>>>
>>>>> I've commented out the WARNING for now, but I am seeing lots of
>>>>> use-after-free's and rcu stalls involving mmu_spte_clear_track_bits:
>>>>>
>>>>>
>>>>> BUG: KASAN: use-after-free in mmu_spte_clear_track_bits+0x186/0x190
>>>>> arch/x86/kvm/mmu.c:597 at addr ffff880068ae2008
>>>>> Read of size 8 by task syz-executor2/16715
>>>>> page:ffffea00016e6170 count:0 mapcount:0 mapping: (null) index:0x0
>>>>> flags: 0x500000000000000()
>>>>> raw: 0500000000000000 0000000000000000 0000000000000000 00000000ffffffff
>>>>> raw: ffffea00017ec5a0 ffffea0001783d48 ffff88006aec5d98
>>>>> page dumped because: kasan: bad access detected
>>>>> CPU: 2 PID: 16715 Comm: syz-executor2 Not tainted 4.10.0-rc3+ #163
>>>>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
>>>>> Call Trace:
>>>>> __dump_stack lib/dump_stack.c:15 [inline]
>>>>> dump_stack+0x292/0x3a2 lib/dump_stack.c:51
>>>>> kasan_report_error mm/kasan/report.c:213 [inline]
>>>>> kasan_report+0x42d/0x460 mm/kasan/report.c:307
>>>>> __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:333
>>>>> mmu_spte_clear_track_bits+0x186/0x190 arch/x86/kvm/mmu.c:597
>>>>> drop_spte+0x24/0x280 arch/x86/kvm/mmu.c:1182
>>>>> kvm_zap_rmapp+0x119/0x260 arch/x86/kvm/mmu.c:1401
>>>>> kvm_unmap_rmapp+0x1d/0x30 arch/x86/kvm/mmu.c:1412
>>>>> kvm_handle_hva_range+0x54a/0x7d0 arch/x86/kvm/mmu.c:1565
>>>>> kvm_unmap_hva_range+0x2e/0x40 arch/x86/kvm/mmu.c:1591
>>>>> kvm_mmu_notifier_invalidate_range_start+0xae/0x140
>>>>> arch/x86/kvm/../../../virt/kvm/kvm_main.c:360
>>>>> __mmu_notifier_invalidate_range_start+0x1f8/0x300 mm/mmu_notifier.c:199
>>>>> mmu_notifier_invalidate_range_start include/linux/mmu_notifier.h:282 [inline]
>>>>> unmap_vmas+0x14b/0x1b0 mm/memory.c:1368
>>>>> unmap_region+0x2f8/0x560 mm/mmap.c:2460
>>>>> do_munmap+0x7b8/0xfa0 mm/mmap.c:2657
>>>>> mmap_region+0x68f/0x18e0 mm/mmap.c:1612
>>>>> do_mmap+0x6a2/0xd40 mm/mmap.c:1450
>>>>> do_mmap_pgoff include/linux/mm.h:2031 [inline]
>>>>> vm_mmap_pgoff+0x1a9/0x200 mm/util.c:305
>>>>> SYSC_mmap_pgoff mm/mmap.c:1500 [inline]
>>>>> SyS_mmap_pgoff+0x22c/0x5d0 mm/mmap.c:1458
>>>>> SYSC_mmap arch/x86/kernel/sys_x86_64.c:95 [inline]
>>>>> SyS_mmap+0x16/0x20 arch/x86/kernel/sys_x86_64.c:86
>>>>> entry_SYSCALL_64_fastpath+0x1f/0xc2
>>>>> RIP: 0033:0x445329
>>>>> RSP: 002b:00007fb33933cb58 EFLAGS: 00000282 ORIG_RAX: 0000000000000009
>>>>> RAX: ffffffffffffffda RBX: 0000000020000000 RCX: 0000000000445329
>>>>> RDX: 0000000000000003 RSI: 0000000000af1000 RDI: 0000000020000000
>>>>> RBP: 00000000006dfe90 R08: ffffffffffffffff R09: 0000000000000000
>>>>> R10: 0000000000000032 R11: 0000000000000282 R12: 0000000000700000
>>>>> R13: 0000000000000006 R14: ffffffffffffffff R15: 0000000020001000
>>>>> Memory state around the buggy address:
>>>>> ffff880068ae1f00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>>>> ffff880068ae1f80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>>>>> ffff880068ae2000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>>>>> ^
>>>>> ffff880068ae2080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>>>>> ffff880068ae2100: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>>>>> ==================================================================
>>>>
>>>> This could be related to the gfn_to_rmap issues.
>>>
>>>
>>> Humm... That's possible. Potentially I am not seeing any more of
>>> spte-related crashes after I applied the following patch:
>>>
>>> --- a/virt/kvm/kvm_main.c
>>> +++ b/virt/kvm/kvm_main.c
>>> @@ -968,8 +968,7 @@ int __kvm_set_memory_region(struct kvm *kvm,
>>> /* Check for overlaps */
>>> r = -EEXIST;
>>> kvm_for_each_memslot(slot, __kvm_memslots(kvm, as_id)) {
>>> - if ((slot->id >= KVM_USER_MEM_SLOTS) ||
>>> - (slot->id == id))
>>> + if (slot->id == id)
>>> continue;
>>> if (!((base_gfn + npages <= slot->base_gfn) ||
>>> (base_gfn >= slot->base_gfn + slot->npages)))
>
> I don't understand how this fixes the test: the only memslot that the
> test creates is at memory range 0x0-0x1000, which should not overlap
> with any private memslots.
> There should be just the IDENTITY_PAGETABLE_PRIVATE_MEMSLOT @
> 0xfffbc000ul.
>
> Do you get any ouput with this hunk?
>
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index a17d78759727..7e1929432232 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -888,6 +888,14 @@ static struct kvm_memslots *install_new_memslots(struct kvm *kvm,
> return old_memslots;
> }
>
> +void kvm_dump_slot(struct kvm_memory_slot *slot)
> +{
> + printk("kvm_memory_slot %p { .id = %u, .base_gfn = %#llx, .npages = %lu, "
> + ".userspace_addr = %#lx, .flags = %u, .dirty_bitmap = %p, .arch = ? }\n",
> + slot, slot->id, slot->base_gfn, slot->npages,
> + slot->userspace_addr, slot->flags, slot->dirty_bitmap);
> +}
> +
> /*
> * Allocate some memory and give it an address in the guest physical address
> * space.
> @@ -978,12 +986,14 @@ int __kvm_set_memory_region(struct kvm *kvm,
> /* Check for overlaps */
> r = -EEXIST;
> kvm_for_each_memslot(slot, __kvm_memslots(kvm, as_id)) {
> - if ((slot->id >= KVM_USER_MEM_SLOTS) ||
> - (slot->id == id))
> + if (slot->id == id)
> continue;
> if (!((base_gfn + npages <= slot->base_gfn) ||
> - (base_gfn >= slot->base_gfn + slot->npages)))
> + (base_gfn >= slot->base_gfn + slot->npages))) {
> + kvm_dump_slot(&new);
> + kvm_dump_slot(slot);
> goto out;
> + }
> }
> }
>
>
>> Friendly ping. Just hit it on
>
> And the warning happens at mmap ... I can't reproduce, but does the bug
> happen on the second mmap()? (Test line 210 when i = 0.)
>
> The change above makes sense as memslots currently cannot overlap
> anywhere. There are three private memslots that can cause this problem:
> TSS, IDENTITY_MAP and APIC.
>
> TSS and IDENTITY_MAP can be configured by userspace and must not
> conflict by design, so we can safely enforce that.
> APIC memslot doesn't provide such guarantees and should be overlaid over
> any memory, but assuming that userspace doesn't configure memslots there
> seems bearable.
>
> Still, I'd like to understand why that patch would fix this bug.
>
> Thanks.

Humm... I cannot reproduce it anymore. Maybe it was fixed by something else...
However this looks very close and is still not fixed:
https://groups.google.com/d/msg/syzkaller/IqkesiRS-t0/aLcJuMXqBgAJ
Maybe it's another reincarnation of the same problem...

>> mmotm/86292b33d4b79ee03e2f43ea0381ef85f077c760 (without the above
>> change):
>>
>> ------------[ cut here ]------------
>> WARNING: CPU: 1 PID: 31060 at arch/x86/kvm/mmu.c:682
>> mmu_spte_clear_track_bits+0x3a1/0x420 arch/x86/kvm/mmu.c:682
>> CPU: 1 PID: 31060 Comm: syz-executor0 Not tainted 4.11.0-rc1+ #328
>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
>> Call Trace:
>> __dump_stack lib/dump_stack.c:16 [inline]
>> dump_stack+0x1a7/0x26a lib/dump_stack.c:52
>> panic+0x1f8/0x40f kernel/panic.c:180
>> __warn+0x1c4/0x1e0 kernel/panic.c:541
>> warn_slowpath_null+0x2c/0x40 kernel/panic.c:584
>> mmu_spte_clear_track_bits+0x3a1/0x420 arch/x86/kvm/mmu.c:682
>> drop_spte+0x24/0x280 arch/x86/kvm/mmu.c:1323
>> mmu_page_zap_pte+0x223/0x350 arch/x86/kvm/mmu.c:2438
>> kvm_mmu_page_unlink_children arch/x86/kvm/mmu.c:2460 [inline]
>> kvm_mmu_prepare_zap_page+0x1ce/0x13d0 arch/x86/kvm/mmu.c:2504
>> kvm_zap_obsolete_pages arch/x86/kvm/mmu.c:5134 [inline]
>> kvm_mmu_invalidate_zap_all_pages+0x4d4/0x6b0 arch/x86/kvm/mmu.c:5175
>> kvm_arch_flush_shadow_all+0x15/0x20 arch/x86/kvm/x86.c:8364
>> kvm_mmu_notifier_release+0x71/0xb0
>> arch/x86/kvm/../../../virt/kvm/kvm_main.c:472
>> __mmu_notifier_release+0x1e5/0x6b0 mm/mmu_notifier.c:75
>> mmu_notifier_release include/linux/mmu_notifier.h:235 [inline]
>> exit_mmap+0x3a3/0x470 mm/mmap.c:2941
>> __mmput kernel/fork.c:890 [inline]
>> mmput+0x228/0x700 kernel/fork.c:912
>> exit_mm kernel/exit.c:558 [inline]
>> do_exit+0x9e8/0x1c20 kernel/exit.c:866
>> do_group_exit+0x149/0x400 kernel/exit.c:983
>> get_signal+0x6d9/0x1840 kernel/signal.c:2318
>> do_signal+0x94/0x1f30 arch/x86/kernel/signal.c:808
>> exit_to_usermode_loop+0x1e5/0x2d0 arch/x86/entry/common.c:157
>> prepare_exit_to_usermode arch/x86/entry/common.c:191 [inline]
>> syscall_return_slowpath+0x3bd/0x460 arch/x86/entry/common.c:260
>> entry_SYSCALL_64_fastpath+0xc0/0xc2
>> RIP: 0033:0x4458d9
>> RSP: 002b:00007ffa472c3b58 EFLAGS: 00000286 ORIG_RAX: 00000000000000ce
>> RAX: fffffffffffffff4 RBX: 0000000000708000 RCX: 00000000004458d9
>> RDX: 0000000000000000 RSI: 000000002006bff8 RDI: 000000000000a05b
>> RBP: 0000000000000fe0 R08: 0000000000000000 R09: 0000000000000000
>> R10: 0000000000000000 R11: 0000000000000286 R12: 00000000006df0a0
>> R13: 000000000000a05b R14: 000000002006bff8 R15: 0000000000000000
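
For reference, the overlap test that both quoted hunks touch can be sketched in userspace roughly as follows. This is only an illustration under stated assumptions: `struct slot`, `ranges_overlap` and `find_conflict` are hypothetical names rather than kernel interfaces, and the `skip_private` flag merely models the `slot->id >= KVM_USER_MEM_SLOTS` exemption that the patch removes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the few kvm_memory_slot fields the check reads. */
struct slot {
    int id;
    uint64_t base_gfn;  /* first guest frame number covered */
    uint64_t npages;    /* number of pages covered */
};

/* True if [a_base, a_base + a_npages) intersects [b_base, b_base + b_npages). */
bool ranges_overlap(uint64_t a_base, uint64_t a_npages,
                    uint64_t b_base, uint64_t b_npages)
{
    /* The sketch assumes the ranges do not wrap around. */
    assert(a_base + a_npages >= a_base);
    assert(b_base + b_npages >= b_base);
    return !((a_base + a_npages <= b_base) ||
             (a_base >= b_base + b_npages));
}

/*
 * Return the first existing slot that conflicts with new_slot, or NULL.
 * With skip_private set, slots whose id is >= user_slots are ignored,
 * which models the behaviour before the quoted change.
 */
const struct slot *find_conflict(const struct slot *slots, size_t n,
                                 const struct slot *new_slot,
                                 int user_slots, bool skip_private)
{
    for (size_t i = 0; i < n; i++) {
        const struct slot *s = &slots[i];

        if (s->id == new_slot->id)
            continue;   /* a slot never conflicts with itself */
        if (skip_private && s->id >= user_slots)
            continue;   /* old behaviour: private slots exempt */
        if (ranges_overlap(new_slot->base_gfn, new_slot->npages,
                           s->base_gfn, s->npages))
            return s;
    }
    return NULL;
}
```

With a one-page slot at gfn 0x0 and a private slot at gfn 0xfffbc (0xfffbc000 >> 12, assuming 4 KiB pages and a one-page private slot), neither mode reports a conflict, which is consistent with Radim's point that the test's only memslot should not overlap a private one. The two modes differ only when a user slot is placed on top of a private slot's range.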