Date: Mon, 11 May 2015 16:00:03 +0200
From: Ingo Molnar
To: Chris J Arges
Cc: Linus Torvalds, Rafael David Tinoco, Peter Anvin, Jiang Liu,
	Peter Zijlstra, LKML, Jens Axboe, Frederic Weisbecker, Gema Gomez,
	the arch/x86 maintainers
Subject: Re: [PATCH] smp/call: Detect stuck CSD locks
Message-ID: <20150511140003.GA5354@gmail.com>
References: <20150407092121.GA9971@gmail.com>
	<20150407205945.GA28212@canonical.com>
	<20150408064734.GA26861@gmail.com>
	<20150413035616.GA24037@canonical.com>
	<20150413061450.GA10857@gmail.com>
	<20150415195452.GA19953@canonical.com>
	<20150416110423.GA15760@gmail.com>
	<20150416155819.GA20490@canonical.com>
	<20150416163140.GA17024@gmail.com>
	<20150429210831.GA17055@canonical.com>
In-Reply-To: <20150429210831.GA17055@canonical.com>

* Chris J Arges wrote:

> Later in the trace we see the same call followed by
> vmx_handle_external_intr() ignoring the call:
>
> [ 603.248016] 2452.083823 | 0)             | ptep_clear_flush() {
> [ 603.248016] 2452.083824 | 0)             | flush_tlb_page() {
> [ 603.248016] 2452.083824 | 0)    0.109 us | leave_mm();
> [ 603.248016] 2452.083824 | 0)             | native_flush_tlb_others() {
> [ 603.248016] 2452.083824 | 0)             | smp_call_function_many() {
> [ 603.248016] 2452.083825 | 0)             | smp_call_function_single() {
> [ 603.248016] 2452.083825 | 0)             | generic_exec_single() {
> [ 603.248016] 2452.083825 | 0)             | native_send_call_func_single_ipi() {
> [ 603.248016] 2452.083825 | 0)             | x2apic_send_IPI_mask() {
> [ 603.248016] 2452.083826 | 0)    1.625 us | __x2apic_send_IPI_mask();
> [ 603.248016] 2452.083828 | 0)    2.173 us | }
> [ 603.248016] 2452.083828 | 0)    2.588 us | }
> [ 603.248016] 2452.083828 | 0)    3.082 us | }
> [ 603.248016] 2452.083828 | 0)             | csd_lock_wait.isra.4() {
> [ 603.248016] 2452.083848 | 1) + 44.033 us | }
> [ 603.248016] 2452.083849 | 1)    0.975 us | vmx_read_l1_tsc();
> [ 603.248016] 2452.083851 | 1)    1.031 us | vmx_handle_external_intr();
> [ 603.248016] 2452.083852 | 1)    0.234 us | __srcu_read_lock();
> [ 603.248016] 2452.083853 | 1)             | vmx_handle_exit() {
> [ 603.248016] 2452.083854 | 1)             | handle_ept_violation() {
> [ 603.248016] 2452.083856 | 1)             | kvm_mmu_page_fault() {
> [ 603.248016] 2452.083856 | 1)             | tdp_page_fault() {
> [ 603.248016] 2452.083856 | 1)    0.092 us | mmu_topup_memory_caches();
> [ 603.248016] 2452.083857 | 1)             | gfn_to_memslot_dirty_bitmap.isra.84() {
> [ 603.248016] 2452.083857 | 1)    0.231 us | gfn_to_memslot();
> [ 603.248016] 2452.083858 | 1)    0.774 us | }
>
> So potentially, CPU0 generated an interrupt that caused
> vcpu_enter_guest to be called on CPU1. However, when
> vmx_handle_external_intr was called, it didn't progress any further.

So the IPI does look to be lost in the KVM code?

So why did vmx_handle_external_intr() skip the irq injection - were
IRQs disabled in the guest perhaps?

> Another experiment here would be to dump
> vmcs_read32(VM_EXIT_INTR_INFO); to see why we don't handle the
> interrupt.

Possibly, but also to instrument the KVM IRQ injection code to see
when it skips an IPI and why.

Thanks,

	Ingo
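
For the VM_EXIT_INTR_INFO experiment discussed in this message, a minimal user-space sketch of how a dumped value could be decoded may help. It assumes the Intel SDM layout of the VM-exit interruption-information field (vector in bits 7:0, type in bits 10:8, valid flag in bit 31). The helper names are hypothetical and not taken from the kernel, and nothing here reads a real VMCS: in the kernel the value would come from vmcs_read32(VM_EXIT_INTR_INFO).

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed field layout of the VM-exit interruption-information field. */
#define INTR_INFO_VECTOR_MASK    0x000000ffu /* bits 7:0: vector */
#define INTR_INFO_INTR_TYPE_MASK 0x00000700u /* bits 10:8: event type */
#define INTR_INFO_VALID_MASK     0x80000000u /* bit 31: field is valid */
#define INTR_TYPE_EXT_INTR       (0u << 8)   /* type 0: external interrupt */

/* Hypothetical helper: extract the interrupt vector. */
static inline unsigned int intr_vector(uint32_t info)
{
	return info & INTR_INFO_VECTOR_MASK;
}

/* Hypothetical helper: extract the raw 3-bit event type. */
static inline unsigned int intr_type(uint32_t info)
{
	return (info & INTR_INFO_INTR_TYPE_MASK) >> 8;
}

/*
 * Hypothetical helper: true only if the field is marked valid AND the
 * event type is "external interrupt". If this is false, a handler that
 * only forwards valid external interrupts would do nothing.
 */
static inline bool is_valid_ext_intr(uint32_t info)
{
	return (info & (INTR_INFO_VALID_MASK | INTR_INFO_INTR_TYPE_MASK)) ==
	       (INTR_INFO_VALID_MASK | INTR_TYPE_EXT_INTR);
}

/* Print one decoded line per dumped value. */
static inline void dump_intr_info(uint32_t info)
{
	printf("intr_info=0x%08x valid=%d type=%u vector=0x%02x ext_intr=%d\n",
	       (unsigned int)info,
	       (info & INTR_INFO_VALID_MASK) != 0,
	       intr_type(info),
	       intr_vector(info),
	       is_valid_ext_intr(info));
}
```

Calling dump_intr_info() on each value logged around vmx_handle_external_intr() would show whether the exit carried a valid external interrupt at all, or whether the valid bit or the type field ruled it out.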