Date: Thu, 28 Jan 2021 06:58:13 -0800
From: "Paul E. McKenney"
To: Dexuan Cui
Cc: Neeraj Upadhyay, "boqun.feng@gmail.com", Ingo Molnar,
 "rcu@vger.kernel.org", vkuznets, Michael Kelley,
 "linux-kernel@vger.kernel.org"
Subject: Re: kdump always hangs in rcu_barrier() -> wait_for_completion()
Message-ID: <20210128145813.GO2743@paulmck-ThinkPad-P72>
Reply-To: paulmck@kernel.org
References: <20201126154630.GR1437@paulmck-ThinkPad-P72>
 <20201126214226.GS1437@paulmck-ThinkPad-P72>
 <20201126235440.GT1437@paulmck-ThinkPad-P72>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Jan 28, 2021 at 07:28:20AM +0000, Dexuan Cui wrote:
> > From: Paul E. McKenney
> > Sent: Thursday, November 26, 2020 3:55 PM
> > To: Dexuan Cui
> > Cc: boqun.feng@gmail.com; Ingo Molnar; rcu@vger.kernel.org; vkuznets;
> > Michael Kelley; linux-kernel@vger.kernel.org
> > Subject: Re: kdump always hangs in rcu_barrier() -> wait_for_completion()
> >
> > On Thu, Nov 26, 2020 at 10:59:19PM +0000, Dexuan Cui wrote:
> > > > From: Paul E. McKenney
> > > > Sent: Thursday, November 26, 2020 1:42 PM
> > > >
> > > > > > Another possibility is that rcu_state.gp_kthread is non-NULL, but that
> > > > > > something else is preventing RCU grace periods from completing, but in
> > > > >
> > > > > It looks like somehow the scheduling is not working here: in rcu_barrier(),
> > > > > if I replace the wait_for_completion() with
> > > > > wait_for_completion_timeout(&rcu_state.barrier_completion, 30*HZ), the
> > > > > issue persists.
> > > >
> > > > Have you tried using sysrq-t to see what the various tasks are doing?
> > >
> > > Will try it.
> > >
> > > BTW, this is a "Generation 2" VM on Hyper-V, meaning sysrq only starts to
> > > work after the Hyper-V para-virtualized keyboard driver loads... So, at this
> > > early point, sysrq is not working.
> > > :-( I'll have to hack the code and use a
> > > virtual NMI interrupt to force the sysrq handler to be called.
> >
> > Whatever works!
> >
> > > > Having interrupts disabled on all CPUs would have the effect of disabling
> > > > the RCU CPU stall warnings.
> > > > Thanx, Paul
> > >
> > > I'm sure the interrupts are not disabled. Here the VM only has 1 virtual CPU,
> > > and when the hang issue happens the virtual serial console is still responding
> > > when I press Enter (it prints a new line) or Ctrl+C (it prints ^C).
> > >
> > > Here the VM does not use the "legacy timers" (PIT, Local APIC timer, etc.) at all.
> > > Instead, the VM uses the Hyper-V para-virtualized timers. It looks like the
> > > Hyper-V timer never fires in the kdump kernel when the hang issue happens. I'm
> > > looking into this... I suspect this hang issue may only be specific to Hyper-V.
> >
> > Fair enough, given that timers not working can also suppress RCU CPU
> > stall warnings. ;-)
> >
> > Thanx, Paul
>
> FYI: the issue has been fixed by this commit:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=fff7b5e6ee63c5d20406a131b260c619cdd24fd1

Thank you for the update!

Thanx, Paul