Date: Thu, 9 May 2019 17:08:48 +0200
From: Petr Mladek
To: Intel Graphics Development, Daniel Vetter, Peter Zijlstra, Ingo Molnar, Will Deacon, Sergey Senozhatsky, Steven Rostedt, John Ogness, linux-kernel@vger.kernel.org
Subject: Re: [Intel-gfx] [PATCH] RFC: console: hack up console_lock more v2
Message-ID: <20190509150848.dj5kb47iuz5tcmso@pathway.suse.cz>
In-Reply-To: <20190508081712.GQ17751@phenom.ffwll.local>

On Wed 2019-05-08 10:17:12, Daniel Vetter wrote:
> On Mon, May 06, 2019 at 01:24:48PM +0200, Petr Mladek wrote:
> > On Mon 2019-05-06 11:38:13, Daniel Vetter wrote:
> > > On Mon, May 06, 2019 at 10:26:28AM +0200, Petr Mladek wrote:
> > > > On Mon 2019-05-06 10:16:14, Petr Mladek wrote:
> > > > > On Mon 2019-05-06 09:45:53, Daniel Vetter wrote:
> > > > > > console_trylock, called from within printk, can be called from pretty
> > > > > > much anywhere. Including try_to_wake_up. Note that this isn't common,
> > > > > > usually the box is in pretty bad shape at that point already. But it
> > > > > > really doesn't help when then lockdep jumps in and spams the logs,
> > > > > > potentially obscuring the real backtrace we're really interested in.
> > > > > > One case I've seen (slightly simplified backtrace):
> > > > > >
> > > > > > Call Trace:
> > > > > >
> > > > > >  console_trylock+0xe/0x60
> > > > > >  vprintk_emit+0xf1/0x320
> > > > > >  printk+0x4d/0x69
> > > > > >  __warn_printk+0x46/0x90
> > > > > >  native_smp_send_reschedule+0x2f/0x40
> > > > > >  check_preempt_curr+0x81/0xa0
> > > > > >  ttwu_do_wakeup+0x14/0x220
> > > > > >  try_to_wake_up+0x218/0x5f0
> > > > >
> > > > > try_to_wake_up() takes p->pi_lock. It could deadlock because it
> > > > > can get called recursively from printk_safe_up().
> > > > >
> > > > > And there are more locks taken from try_to_wake_up(), for example,
> > > > > __task_rq_lock() taken from ttwu_remote().
> > > > >
> > > > > IMHO, the most reliable solution would be do call the entire
> > > > > up_console_sem() from printk deferred context. We could assign
> > > > > few bytes for this context in the per-CPU printk_deferred
> > > > > variable.
> > > >
> > > > Ah, I was too fast and did the same mistake. This won't help because
> > > > it would still call try_to_wake_up() recursively.
> > >
> > > Uh :-/
> > >
> > > > We need to call all printk's that can be called under locks
> > > > taken in try_to_wake_up() path in printk deferred context.
> > > > Unfortunately it is whack a mole approach.
> > >
> > > Hm since it's whack-a-mole anyway, what about converting the WARN_ON into
> > > a prinkt_deferred, like all the other scheduler related code? Feels a
> > > notch more consistent to me than leaking the printk_context into areas it
> > > wasn't really meant built for. Scheduler code already fully subscribed to
> > > the whack-a-mole approach after all.
> >
> > I am not sure how exactly you mean the conversion.
> >
> > Anyway, we do not want to use printk_deferred() treewide. It reduces
> > the chance that the messages reach consoles. Scheduler is an
> > exception because of the possible deadlocks.
> >
> > A solution would be to define WARN_ON_DEFERRED() that would
> > call normal WARN_ON() in printk deferred context and
> > use in scheduler.
>
> Sent it out, and then Sergey pointed out printk_safe_enter/exit (which I
> guess is what you meant, and which I missed)

No, I meant introducing a deferred printk context that would behave
like printk_deferred(). The printk safe context is more limiting. It
prevents a deadlock on logbuf_lock by temporarily saving the messages
into per-CPU buffers. In the scheduler, we could store the messages
directly into the main log buffer. We just need to defer the console
handling to avoid a deadlock on the scheduler locks.

> , but we're doing this already around the up() call
> in __up_console_sem.
>
> So I think these further recursions you're pointed out are already handled
> correctly, and all we need to do is to break the loop involving
> semaphore.lock of the console_lock semaphore only. Which I think this
> patch here achieves.

The printk safe context would help only when try_to_wake_up() and all
other functions using the same locks _all over the system_ are called
in printk safe (or deferred) context.
In other words, the printk safe context solves only printk() recursion.
It does not solve recursion in the scheduler operations.

Best Regards,
Petr
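
To illustrate the difference discussed above, here is a rough user-space
model of a deferred printk context. It is only a sketch and not kernel
code: every name in it (model_printk(), model_deferred_enter(),
model_warn_on_deferred(), model_run_deferred_work(), and so on) is
hypothetical and made up for this example. The assumption is that a
message printed in the deferred context goes straight into the main log
buffer, while the console handling is postponed until some later point
outside the scheduler locks (in the kernel that would be something like
irq_work; here it is an explicit function call).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
#define LOG_SIZE 8
#define MSG_LEN  64

static char log_buf[LOG_SIZE][MSG_LEN]; /* stands in for the main log buffer */
static int log_count;      /* messages stored so far */
static int console_seen;   /* messages already pushed to the "console" */
static int deferred_depth; /* > 0 while inside the deferred context */
static int flush_pending;  /* console work postponed by the deferred context */

/* Push every stored but not yet printed message to the console (stdout). */
void model_console_flush(void)
{
	while (console_seen < log_count)
		printf("console: %s\n", log_buf[console_seen++]);
	flush_pending = 0;
}

void model_deferred_enter(void)
{
	deferred_depth++;
}

void model_deferred_exit(void)
{
	assert(deferred_depth > 0);
	deferred_depth--;
}

/* Always store into the main buffer; touch the console only when allowed. */
void model_printk(const char *msg)
{
	if (log_count < LOG_SIZE) {
		strncpy(log_buf[log_count], msg, MSG_LEN - 1);
		log_buf[log_count][MSG_LEN - 1] = '\0';
		log_count++;
	}

	if (deferred_depth > 0)
		flush_pending = 1;	/* console handling happens later */
	else
		model_console_flush();	/* normal path: console right away */
}

/* Model of the WARN_ON_DEFERRED() idea: warn inside the deferred context. */
int model_warn_on_deferred(int cond, const char *what)
{
	if (cond) {
		model_deferred_enter();
		model_printk(what);
		model_deferred_exit();
	}
	return cond;
}

/* Called later, outside the (modelled) scheduler locks. */
void model_run_deferred_work(void)
{
	if (flush_pending && deferred_depth == 0)
		model_console_flush();
}
```

In this model, a call to model_warn_on_deferred() only stores the
message; nothing reaches the console until model_run_deferred_work()
runs. The model does not cover the second half of the argument: the
real problem is that all printk() callers under the scheduler locks
would need such a context, which a single wrapper cannot guarantee.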