Received: by 2002:a5b:505:0:0:0:0:0 with SMTP id o5csp6012671ybp; Tue, 8 Oct 2019 11:37:02 -0700 (PDT) X-Google-Smtp-Source: APXvYqxidBo0BRv0F1D09q9pd2IFd2TZ1eOCGwPPpMPHQy8bmnagx00iD5bkfoudTXa+G9j4qkWy X-Received: by 2002:a17:907:441d:: with SMTP id om21mr30816100ejb.188.1570559822139; Tue, 08 Oct 2019 11:37:02 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1570559822; cv=none; d=google.com; s=arc-20160816; b=T3p/dGbR2f2sIMT7xNlQ2hybwbUo3FFywdbaorpo88xMpadoX7XlnOCR4yId0JIU7U qsgbiNrCPBdf3YaA3FVDIwew5p5/85csMHnwyfF+ucsi3Y1SO8o8C5SdlxyrGS5IX9vT VLmCWpvRFEZMkOall3QMjVkX6RT3+DLhOj1TqWL9JbK3xxbIKk5nzHmF748yRKunr5+J 2/wveBoFa3vrcV/RAR+nV2CVccT/Y34NXClKtGTHVwv3Jo5+vJ97Y97yXddTJ9tfOZcg rpqUhadNBis1D6huR//gZMbUXQ66qCpdiIA9ozvQsrawwDTKYlmoCvNaSOIIC49VSzY3 cM3A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-transfer-encoding:content-disposition:mime-version :references:message-id:subject:cc:to:from:date; bh=YyXGkSutnrxnG4W1yY7xEFxHe9+m50hkcaWr6S7C84s=; b=zk9vQsMOgTVWGc+gTSuuZmbxk64eq8WqNHPwNzyWD2RrMh/XLwULL75cF6REZjF7QB zTDyTXAda/iuRWx8YTsNVedm8YOW9cWcer8cv8hV/lNFw9/DHxx2AY0vcO3jY9AYIft9 t0s7kzfcG3hAQuvgHwzkNasENJZSkbQgYGo791b+AjdKA18bnOgb2yP+xE6rcUpOpVGt LrNolouNEOn/V9Pi5zIWfSOkqwz2ubkDJledTD5LMailMk49MX9e/Nu3cbZf1Ufu3tVP G5sZoGPLsV4l5KVpUlcwxrrONE4DORTbiYakFZx+i9tT8rkiPViXVFKa9Yn4jyy/b+rc cTYQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id qh28si9226395ejb.210.2019.10.08.11.36.37; Tue, 08 Oct 2019 11:37:02 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727220AbfJHSf3 (ORCPT + 99 others); Tue, 8 Oct 2019 14:35:29 -0400 Received: from mx2.suse.de ([195.135.220.15]:52148 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1726138AbfJHSf3 (ORCPT ); Tue, 8 Oct 2019 14:35:29 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id C9755AC10; Tue, 8 Oct 2019 18:35:27 +0000 (UTC) Date: Tue, 8 Oct 2019 20:35:25 +0200 From: Michal Hocko To: Qian Cai Cc: Christian Borntraeger , Petr Mladek , Peter Oberparleiter , akpm@linux-foundation.org, sergey.senozhatsky.work@gmail.com, rostedt@goodmis.org, peterz@infradead.org, linux-mm@kvack.org, john.ogness@linutronix.de, david@redhat.com, linux-kernel@vger.kernel.org, Heiko Carstens , Vasily Gorbik Subject: Re: [PATCH v2] mm/page_isolation: fix a deadlock with printk() Message-ID: <20191008183525.GQ6681@dhcp22.suse.cz> References: <1570228005-24979-1-git-send-email-cai@lca.pw> <20191007143002.l37bt2lzqtnqjqxu@pathway.suse.cz> <20191007144937.GO2381@dhcp22.suse.cz> <20191008074357.f33f6pbs4cw5majk@pathway.suse.cz> <20191008082752.GB6681@dhcp22.suse.cz> <1570550917.5576.303.camel@lca.pw> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <1570550917.5576.303.camel@lca.pw> User-Agent: Mutt/1.10.1 (2018-07-13) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue 08-10-19 12:08:37, Qian Cai wrote: > On Tue, 2019-10-08 at 14:56 +0200, Christian Borntraeger wrote: > > Adding Peter Oberparleiter. > > Peter, can you have a look? > > > > On 08.10.19 10:27, Michal Hocko wrote: > > > On Tue 08-10-19 09:43:57, Petr Mladek wrote: > > > > On Mon 2019-10-07 16:49:37, Michal Hocko wrote: > > > > > [Cc s390 maintainers - the lockdep is http://lkml.kernel.org/r/1570228005-24979-1-git-send-email-cai@lca.pw > > > > > Petr has explained it is a false positive > > > > > http://lkml.kernel.org/r/20191007143002.l37bt2lzqtnqjqxu@pathway.suse.cz] > > > > > On Mon 07-10-19 16:30:02, Petr Mladek wrote: > > > > > [...] > > > > > > I believe that it cannot really happen because: > > > > > > > > > > > > static int __init > > > > > > sclp_console_init(void) > > > > > > { > > > > > > [...] > > > > > > rc = sclp_rw_init(); > > > > > > [...] > > > > > > register_console(&sclp_console); > > > > > > return 0; > > > > > > } > > > > > > > > > > > > sclp_rw_init() is called before register_console(). And > > > > > > console_unlock() will never call sclp_console_write() before > > > > > > the console is registered. > > > > > > > > > > > > AFAIK, lockdep only compares existing chain of locks. It does > > > > > > not know about console registration that would make some > > > > > > code paths mutually exclusive. > > > > > > > > > > > > I believe that it is a false positive. I do not know how to > > > > > > avoid this lockdep report. I hope that it will disappear > > > > > > by deferring all printk() calls rather soon. > > > > > > > > > > Thanks a lot for looking into this Petr. I have also checked the code > > > > > and I really fail to see why the allocation has to be done under the > > > > > lock in the first place. sclp_read_sccb and sclp_init_sccb are global > > > > > variables but I strongly suspect that they need a synchronization during > > > > > early init, callbacks are registered only later IIUC: > > > > > > > > Good idea. It would work when the init function is called only once. > > > > But see below. > > > > > > > > > diff --git a/drivers/s390/char/sclp.c b/drivers/s390/char/sclp.c > > > > > index d2ab3f07c008..4b1c033e3255 100644 > > > > > --- a/drivers/s390/char/sclp.c > > > > > +++ b/drivers/s390/char/sclp.c > > > > > @@ -1169,13 +1169,13 @@ sclp_init(void) > > > > > unsigned long flags; > > > > > int rc = 0; > > > > > > > > > > + sclp_read_sccb = (void *) __get_free_page(GFP_ATOMIC | GFP_DMA); > > > > > + sclp_init_sccb = (void *) __get_free_page(GFP_ATOMIC | GFP_DMA); > > > > > spin_lock_irqsave(&sclp_lock, flags); > > > > > /* Check for previous or running initialization */ > > > > > if (sclp_init_state != sclp_init_state_uninitialized) > > > > > goto fail_unlock; > > > > > > > > It seems that sclp_init() could be called several times in parallel. > > > > I see it called from sclp_register() and sclp_initcall(). > > > > > > Interesting. Something for s390 people to answer I guess. > > > Anyway, this should be quite trivial to workaround by a cmpxch or alike. > > > > > The above fix is simply insufficient, Isn't this yet another init time lockdep false possitive? > 00: [????3.654337] -> #3 (console_owner){....}:????????????????????????????????? > 00: [????3.654343]????????lock_acquire+0x21a/0x468?????????????????????????????? > 00: [????3.654345]????????console_unlock+0x3a6/0xa30???????????????????????????? > 00: [????3.654346]????????vprintk_emit+0x184/0x3c8?????????????????????????????? > 00: [????3.654348]????????vprintk_default+0x44/0x50????????????????????????????? > 00: [????3.654349]????????printk+0xa8/0xc0?????????????????????????????????????? > 00: [????3.654351]????????get_random_u64+0x40/0x108????????????????????????????? > 00: [????3.654360]????????add_to_free_area_random+0x188/0x1c0??????????????????? > 00: [????3.654364]????????free_one_page+0x72/0x128?????????????????????????????? > 00: [????3.654366]????????__free_pages_ok+0x51c/0xca0??????????????????????????? > 00: [????3.654368]????????memblock_free_all+0x30a/0x3b0????????????????????????? > 00: [????3.654370]????????mem_init+0x84/0x200??????????????????????????????????? > 00: [????3.654371]????????start_kernel+0x384/0x6a0?????????????????????????????? > 00: [????3.654373]????????startup_continue+0x70/0xd0???????????????????????????? This one is actually a nice example why trying to get printk out of the zone->lock is simply not viable. This one is likely a printk to warn that the random pool is not fully intiailized. Just because the allocator tries to randomize the initial free memory pool. You are not going to remove that printk, right? I fully agree that this class of lockdep splats are annoying especially when they make the lockdep unusable but please discuss this with lockdep maintainers and try to find some solution rather than go and try to workaround the problem all over the place. If there are places that would result in a cleaner code then go for it but please do not make the code worse just because of a non existent problem flagged by a false positive. -- Michal Hocko SUSE Labs