Subject: Re: [BUG] Deadlock due due to interactions of block, RCU, and cpu offline
From: Paolo Bonzini
To: paulmck@linux.vnet.ibm.com, Jeffrey Hugo
Cc: linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, pprakash@codeaurora.org, Josh Triplett, Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan, Jens Axboe, Sebastian Andrzej Siewior, Thomas Gleixner, Richard Cochran, Boris Ostrovsky, Richard Weinberger
Date: Tue, 22 Aug 2017 18:12:04 +0200
In-Reply-To: <20170820205658.GS11320@linux.vnet.ibm.com>

On 20/08/2017 22:56, Paul E. McKenney wrote:
>> KVM: async_pf: avoid async pf injection when in guest mode
>> KVM: cpuid: Fix read/write out-of-bounds vulnerability in cpuid emulation
>> arm: KVM: Allow unaligned accesses at HYP
>> arm64: KVM: Allow unaligned accesses at EL2
>> arm64: KVM: Preserve RES1 bits in SCTLR_EL2
>> KVM: arm/arm64: Handle possible NULL stage2 pud when ageing pages
>> KVM: nVMX: Fix exception injection
>> kvm: async_pf: fix rcu_irq_enter() with irqs enabled
>> KVM: arm/arm64: vgic-v3: Fix nr_pre_bits bitfield extraction
>> KVM: s390: fix ais handling vs cpu model
>> KVM: arm/arm64: Fix isues with GICv2 on GICv3 migration
>>
>> Nothing really stands out to me which would "fix" the issue.
>
> My guess would be an undo of the change that provoked the problem
> in the first place.  Did you try bisecting within the above group
> of commits?
>
> Either way, CCing Paolo for his thoughts?

There is "kvm: async_pf: fix rcu_irq_enter() with irqs enabled", but it
would have caused splats, not deadlocks.  If you are using nested
virtualization, "KVM: async_pf: avoid async pf injection when in guest
mode" can be a wildcard, but only if you have memory pressure.

My bet is still on the former changing the timing just a little bit.

Paolo