Date: Thu, 9 May 2019 16:56:20 +0200
From: Petr Mladek
To: Daniel Vetter
Cc: Intel Graphics Development, DRI Development, Daniel Vetter,
	Peter Zijlstra, Ingo Molnar, Will Deacon, Sergey Senozhatsky,
	Steven Rostedt, John Ogness, Chris Wilson, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] RFC: console: hack up console_lock more v3
Message-ID: <20190509145620.2pjqko7copbxuzth@pathway.suse.cz>
References: <20190509120903.28939-1-daniel.vetter@ffwll.ch>
In-Reply-To: <20190509120903.28939-1-daniel.vetter@ffwll.ch>
User-Agent: NeoMutt/20170912 (1.9.0)
List-ID: linux-kernel@vger.kernel.org

On Thu 2019-05-09 14:09:03, Daniel Vetter wrote:
> console_trylock, called from within printk, can be called from pretty
> much anywhere. Including try_to_wake_up.
> Note that this isn't common, usually the box is in pretty bad shape at
> that point already. But it really doesn't help when lockdep then jumps
> in and spams the logs, potentially obscuring the real backtrace we're
> actually interested in. One case I've seen (slightly simplified
> backtrace):
>
>  Call Trace:
>   console_trylock+0xe/0x60
>   vprintk_emit+0xf1/0x320
>   printk+0x4d/0x69
>   __warn_printk+0x46/0x90
>   native_smp_send_reschedule+0x2f/0x40
>   check_preempt_curr+0x81/0xa0
>   ttwu_do_wakeup+0x14/0x220
>   try_to_wake_up+0x218/0x5f0
>   pollwake+0x6f/0x90
>   credit_entropy_bits+0x204/0x310
>   add_interrupt_randomness+0x18f/0x210
>   handle_irq+0x67/0x160
>   do_IRQ+0x5e/0x130
>   common_interrupt+0xf/0xf
>
> This alone isn't a problem, but the spinlock in the semaphore is also
> still held while waking up waiters (the up() -> __up() ->
> try_to_wake_up() callchain), which then closes the runqueue vs.
> semaphore.lock loop and upsets lockdep, which issues a circular
> locking splat to dmesg. Worse, it upsets developers, since we don't
> want to spam dmesg with clutter when the machine is dying already.
>
> Fix this by creating a printk_safe_up() which calls wake_up_process()
> outside of the spinlock. This isn't correct in full generality, but
> good enough for console_lock:
>
> - console_lock doesn't use interruptible, killable or timeout down()
>   calls, hence an up() is the only thing that can wake up a process.
>   Hence the process can't get woken, killed and reaped while we try
>   to wake it up.
>
> - semaphore.c always updates the waiter list while under the spinlock,
>   so there are no other races. Specifically, another process that
>   races with a quick console_lock/unlock while we've dropped the
>   spinlock already won't see our own waiter.
>
> Note that we only have to break the recursion for the semaphore.lock
> spinlock of the console_lock. Recursion within various scheduler
> related locks is already prevented by the printk_safe_enter/exit pair
> in __up_console_sem().
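[The core pattern of the proposed fix — dequeue the waiter while holding the lock, but issue the wakeup only after the lock is dropped — can be sketched in plain userspace C. Everything below is an illustrative analogue, not the kernel's code: a pthread mutex stands in for the raw spinlock, and a condvar stands in for wake_up_process().]

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

struct waiter {
	struct waiter *next;
	bool up;                  /* set by the waker under the lock */
	pthread_cond_t cond;      /* stands in for wake_up_process() */
};

struct sem {
	pthread_mutex_t lock;     /* stands in for the raw spinlock */
	int count;
	struct waiter *wait_list;
};

/* Pick the waiter under the lock, but signal it only after unlock. */
static void safe_up(struct sem *s)
{
	struct waiter *w = NULL;

	pthread_mutex_lock(&s->lock);
	if (!s->wait_list) {
		s->count++;
	} else {
		w = s->wait_list;
		s->wait_list = w->next;
		w->up = true;     /* waiter rechecks this under the lock */
	}
	pthread_mutex_unlock(&s->lock);

	if (w)                    /* sole wake source, as in the patch */
		pthread_cond_signal(&w->cond);
}
```

Because w->up is set while the lock is held and a waiter would recheck it under the same lock, the wakeup cannot be lost even though the signal happens after unlock; the point of the patch is that the scheduler is entered with no semaphore lock held.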
This is not fully true. printk_safe() helps only when the first
try_to_wake_up() is called from a printk_safe() context.

> --- a/kernel/locking/semaphore.c
> +++ b/kernel/locking/semaphore.c
> @@ -197,6 +197,37 @@ struct semaphore_waiter {
>  	bool up;
>  };
>  
> +/**
> + * printk_safe_up - release the semaphore in console_unlock
> + * @sem: the semaphore to release
> + *
> + * Release the semaphore. Unlike mutexes, up() may be called from any
> + * context and even by tasks which have never called down().
> + *
> + * NOTE: This is a special version of up() for console_unlock only. It is only
> + * safe if there are no killable, interruptible or timing out down() calls.
> + */
> +void printk_safe_up(struct semaphore *sem)
> +{
> +	unsigned long flags;
> +	struct semaphore_waiter *waiter = NULL;
> +
> +	raw_spin_lock_irqsave(&sem->lock, flags);
> +	if (likely(list_empty(&sem->wait_list))) {
> +		sem->count++;
> +	} else {
> +		waiter = list_first_entry(&sem->wait_list,
> +					  struct semaphore_waiter, list);
> +		list_del(&waiter->list);
> +		waiter->up = true;
> +	}
> +	raw_spin_unlock_irqrestore(&sem->lock, flags);
> +
> +	if (waiter) /* protected by being sole wake source */
> +		wake_up_process(waiter->task);

I still do not see how this could help. Let's take the above backtrace
as an example:

  console_trylock+0xe/0x60
  vprintk_emit+0xf1/0x320
  printk+0x4d/0x69
  __warn_printk+0x46/0x90
  native_smp_send_reschedule+0x2f/0x40
  check_preempt_curr+0x81/0xa0
  ttwu_do_wakeup+0x14/0x220
  try_to_wake_up+0x218/0x5f0
  pollwake+0x6f/0x90
  credit_entropy_bits+0x204/0x310
  add_interrupt_randomness+0x18f/0x210
  handle_irq+0x67/0x160
  do_IRQ+0x5e/0x130
  common_interrupt+0xf/0xf

We have the following chain of calls:

  + do_IRQ()
    ...
    + try_to_wake_up()		# takes p->pi_lock
      + ttwu_remote()		# takes rq lock
        + ttwu_do_wakeup()
          + check_preempt_curr()
            + native_smp_send_reschedule()
              + __warn_printk()
                + printk()
                  + vprintk_emit()
                    + console_trylock()	# success
                    + console_unlock()
                      + up_console_sem()
                        + up()		# wait list is not empty
                          + __up()
                            + wake_up_process()
                              + try_to_wake_up()

!BANG! Deadlock on p->pi_lock.

It does not matter whether the nested try_to_wake_up() was called under
sem->lock or outside of it.

In other words, the patch removes one lockdep warning, but it only
delays the deadlock. The deadlock will not happen on sem->lock but
later on p->pi_lock.

I am repeating myself, but IMHO the only solution is to introduce a
printk deferred context and use it in WARN_DEFERRED().

Best Regards,
Petr
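[The deferred approach suggested above can be illustrated with a minimal userspace sketch. All names here are hypothetical stand-ins, not the kernel's actual printk_deferred/WARN_DEFERRED machinery: in an unsafe section the message is only copied into a buffer, with no locks taken and no wakeups issued, and the buffer is drained later from a safe context (in the kernel, e.g. via irq_work).]

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define DEFER_BUF_SZ 1024

static char defer_buf[DEFER_BUF_SZ];
static size_t defer_len;
static bool in_deferred;	/* would be set around WARN_DEFERRED() */

/* Entry point: in a deferred (unsafe) section, only record the message. */
static void printk_deferred_sketch(const char *msg)
{
	if (in_deferred) {
		size_t n = strlen(msg);

		if (defer_len + n < DEFER_BUF_SZ) {
			memcpy(defer_buf + defer_len, msg, n);
			defer_len += n;
		}
		return;		/* no console lock, no wakeups here */
	}
	fputs(msg, stdout);	/* safe context: print directly */
}

/* Called later from a safe context to drain the buffer. */
static void flush_deferred(void)
{
	fwrite(defer_buf, 1, defer_len, stdout);
	defer_len = 0;
}
```

The point is that nothing in the unsafe path can recurse into try_to_wake_up(), because the wakeup of console waiters happens only during the later flush.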