Date: Tue, 8 Oct 2019 15:37:27 +0200
From: Michal Hocko
To: Qian Cai
Cc: Petr Mladek, akpm@linux-foundation.org, sergey.senozhatsky.work@gmail.com,
    rostedt@goodmis.org, peterz@infradead.org, linux-mm@kvack.org,
    john.ogness@linutronix.de, david@redhat.com, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] mm/page_isolation: fix a deadlock with printk()
Message-ID: <20191008133727.GK6681@dhcp22.suse.cz>
References: <20191008103907.GE6681@dhcp22.suse.cz>
    <3836DE34-9DD2-4815-9E1E-CB87D881B9AD@lca.pw>
    <20191008123920.GI6681@dhcp22.suse.cz>
    <1570539989.5576.295.camel@lca.pw>
In-Reply-To: <1570539989.5576.295.camel@lca.pw>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue 08-10-19
09:06:29, Qian Cai wrote:
> On Tue, 2019-10-08 at 14:39 +0200, Michal Hocko wrote:
> > On Tue 08-10-19 08:00:43, Qian Cai wrote:
> > >
> > >
> > > > On Oct 8, 2019, at 6:39 AM, Michal Hocko wrote:
> > > >
> > > > Have you actually triggered any real deadlock? With a zone->lock in
> > > > place it would be pretty clear with hard lockups detected.
> > >
> > > Yes, I did trigger here and there, and those lockdep splats are
> > > especially useful to figure out why.
> >
> > Can you provide a lockdep splat from an actual deadlock please? I am
> > sorry but your responses tend to be really cryptic and I never know when
> > you are talking about actual deadlocks and lockdep splats. I have asked
> > about the former several times never receiving a specific answer.
>
> It is very time-consuming to confirm a lockdep splat is 100% matching a deadlock
> giving that it is not able to reproduce on will yet, so when I did encounter a
> memory offline deadlock where "echo offline > memory/state" just hang, but there
> is no hard lockup probably because the hard lockup detector did not work
> properly for some reasons or it keep trying to acquire a spin lock that only
> keep the CPU 100%.

If there is a real deadlock due to zone->lock then you would certainly
get a hard lockup splat. So I strongly suspect that you are seeing a
completely different problem. Most likely some pages cannot be migrated
and the offlining code will retry for ever. You can terminate that from
the userspace by a fatal signal of course.
-- 
Michal Hocko
SUSE Labs