Date: Fri, 3 May 2019 17:14:37 +0200
From: Petr Mladek
To: Daniel Vetter
Cc: DRI Development, Intel Graphics Development, Daniel Vetter,
    Peter Zijlstra, Ingo Molnar, Will Deacon, Sergey Senozhatsky,
    Steven Rostedt, John Ogness, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] RFC: console: hack up console_trylock more
Message-ID: <20190503151437.dc2ty2mnddabrz4r@pathway.suse.cz>
References: <20190502141643.21080-1-daniel.vetter@ffwll.ch>
In-Reply-To: <20190502141643.21080-1-daniel.vetter@ffwll.ch>

On Thu 2019-05-02 16:16:43, Daniel Vetter wrote:
> console_trylock, called from within printk, can be called from pretty
> much anywhere. Including try_to_wake_up.
> Note that this isn't common,
> usually the box is in pretty bad shape at that point already. But it
> really doesn't help when then lockdep jumps in and spams the logs,
> potentially obscuring the real backtrace we're really interested in.
> One case I've seen (slightly simplified backtrace):
>
> Call Trace:
>
> console_trylock+0xe/0x60
> vprintk_emit+0xf1/0x320
> printk+0x4d/0x69
> __warn_printk+0x46/0x90
> native_smp_send_reschedule+0x2f/0x40
> check_preempt_curr+0x81/0xa0
> ttwu_do_wakeup+0x14/0x220
> try_to_wake_up+0x218/0x5f0
> pollwake+0x6f/0x90
> credit_entropy_bits+0x204/0x310
> add_interrupt_randomness+0x18f/0x210
> handle_irq+0x67/0x160
> do_IRQ+0x5e/0x130
> common_interrupt+0xf/0xf
>
>
> This alone isn't a problem, but the spinlock in the semaphore is also
> still held while waking up waiters (up() -> __up() -> try_to_wake_up()
> callchain), which then closes the runqueue vs. semaphore.lock loop,
> and upsets lockdep, which issues a circular locking splat to dmesg.
> Worse it upsets developers, since we don't want to spam dmesg with
> clutter when the machine is dying already.
>
> Fix this by creating a __down_trylock which only trylocks the
> semaphore.lock. This isn't correct in full generality, but good enough
> for console_lock:
>
> - there's only ever one console_lock holder, we won't fail spuriously
>   because someone is doing a down() or up() while there's still room
>   (unlike other semaphores with count > 1).
>
> - console_unlock() has one massive retry loop, which will catch anyone
>   who races the trylock against the up(). This makes sure that no
>   printk lines will get lost. Making the trylock more racy therefore
>   has no further impact.

To be honest, I do not see how this could solve the problem. The circular
dependency is still there. If the new __down_trylock() succeeds then
console_unlock() will get called in the same context and it will still
need to call up() -> try_to_wake_up().
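
To make the argument concrete, here is a minimal userspace sketch of the dependency bookkeeping. It is a toy model, not kernel code: the two lock classes and every model_* name are hypothetical, and it assumes, as a simplification, that a trylock records no new dependency while a blocking acquire records one edge from each lock already held.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock classes standing in for the runqueue lock and sem->lock. */
enum lock_class { RQ_LOCK, SEM_LOCK, NR_CLASSES };

/* edge[a][b] is true once b was taken with a blocking acquire while a was held. */
static bool edge[NR_CLASSES][NR_CLASSES];
static enum lock_class held[8];
static int depth;

static void model_reset(void)
{
	memset(edge, 0, sizeof(edge));
	depth = 0;
}

static void model_acquire(enum lock_class cls, bool trylock)
{
	assert(depth < 8);
	/* Assumption: only a blocking acquire records held -> cls dependencies. */
	if (!trylock)
		for (int i = 0; i < depth; i++)
			edge[held[i]][cls] = true;
	held[depth++] = cls;
}

static void model_release(enum lock_class cls)
{
	assert(depth > 0 && held[depth - 1] == cls);
	depth--;
}

/* Models up(): sem->lock is taken for real and a waiter is woken under it. */
static void model_up(bool has_waiter)
{
	model_acquire(SEM_LOCK, false);
	if (has_waiter) {
		model_acquire(RQ_LOCK, false);	/* try_to_wake_up() */
		model_release(RQ_LOCK);
	}
	model_release(SEM_LOCK);
}

/* Models printk() reached from the wakeup path, with RQ_LOCK already held. */
static void model_printk_under_rq(bool patched)
{
	model_acquire(RQ_LOCK, false);
	model_acquire(SEM_LOCK, patched);	/* console_trylock() on sem->lock */
	model_release(SEM_LOCK);
	model_up(false);			/* console_unlock() -> up() */
	model_release(RQ_LOCK);
}

/* With two classes, a cycle exists once both directions have been recorded. */
static bool model_has_cycle(void)
{
	return edge[RQ_LOCK][SEM_LOCK] && edge[SEM_LOCK][RQ_LOCK];
}
```

In this model, calling model_printk_under_rq(true) and then model_up(true) from some other context makes model_has_cycle() return true even though the trylock itself recorded nothing: the RQ_LOCK -> SEM_LOCK edge comes from the up() in the unlock path, and the SEM_LOCK -> RQ_LOCK edge from waking a waiter.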

Note that there are many other console_lock() callers that might happen in
parallel and might appear in the wait queue.

Best Regards,
Petr