Received: by 2002:a25:4158:0:0:0:0:0 with SMTP id o85csp4011203yba; Tue, 7 May 2019 10:36:09 -0700 (PDT) X-Google-Smtp-Source: APXvYqyNGNzIfwHNLDRszBV9PUVYLZlxAPQMeqQx/hrKzcfpZzEDu1ybWMdGVoYKxsbJifoy5cHp X-Received: by 2002:a17:902:f208:: with SMTP id gn8mr2052550plb.312.1557250569634; Tue, 07 May 2019 10:36:09 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1557250569; cv=none; d=google.com; s=arc-20160816; b=0VEyaD2XRE4Tgradclh6q3zt79J3jw4ikt72QwdopBwqVNvUvhyixfDdq44NyLI3IC ISlfeQ/jrNXfI+c3iA7CsNiTHRpvdoYqrQUf8FmFikFkzinRwFdwBFogjM/nyAvHSZOZ nod4vvem0Vjqy2nP3G1rAsOV/UkUFy0ns0Uyt1/dCbLUV2LtpjmiN1+heFoZdKCe1exC hqdsrHE4adUT2THODzLm+7AHLcRJI3742hQWQkNn1CLN0zxHxskV3bD0YEcAVoNAA9Lf 8JEUv5xgcas4M3oJ8xNxlnka9n4r+pN72r6HlvOzu+eNNLtDWRWkdYaH/q9UMQIQU3wU Xhsg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:mime-version :message-id:date:subject:cc:to:from:dkim-signature; bh=By/YTXwruc5w+T+z2pt0ueyq600FZNbXB/6wgpzCllU=; b=vJ3gcTYuhd9F5RqZB+4DevX6+Wyfss0km3stE6nCNI2cAgYEDQJOMLZn4yo+DIdraO Cfd6fRTDv7c7KMnb6bl3pGOzItkgr1qy3MjimAc0sDPlEdhmKlN680APjM3PwwVLAYGd aqSZpSDCj+b+C7F8ciFxK0A0I3VjEqTVSL4rpkQaB46acbGJK4lOakGZ8k3Oy8d+wcVX 0SueynyfvK/Fd7Kae+vlrPBVdBIS7qwwCbepMJqI9kbMKV78nSBhcyniGsoaxeyJv/1q Q6Jnv0enzIbgdQFSIWSduKuGpJSCrhsCgnCM2BWa2wwhM6DLU/RhzEPxap3/kTHDt4MY 6HSA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@ffwll.ch header.s=google header.b=IRecT2OK; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id e14si19397521pgh.270.2019.05.07.10.35.53; Tue, 07 May 2019 10:36:09 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@ffwll.ch header.s=google header.b=IRecT2OK; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727231AbfEGRdi (ORCPT + 99 others); Tue, 7 May 2019 13:33:38 -0400 Received: from mail-ed1-f68.google.com ([209.85.208.68]:36754 "EHLO mail-ed1-f68.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726256AbfEGRdh (ORCPT ); Tue, 7 May 2019 13:33:37 -0400 Received: by mail-ed1-f68.google.com with SMTP id a8so19517475edx.3 for ; Tue, 07 May 2019 10:33:36 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ffwll.ch; s=google; h=from:to:cc:subject:date:message-id:mime-version :content-transfer-encoding; bh=By/YTXwruc5w+T+z2pt0ueyq600FZNbXB/6wgpzCllU=; b=IRecT2OKkFRfFRAye0IddpuN/wdrWMSDOiynnDPnjXt+OWBDhneVqcyps+1EqtnwgT fN4V1Ew1Op754Jx9OKCQdNdJWJcGkSbHNu7qqLhW0O+x9DWI7fXcFc37xfCiEw+vjMKp NEv4eY8gAUDK+kSzUpb5MkeVLzQERFgu4sTxM= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:mime-version :content-transfer-encoding; bh=By/YTXwruc5w+T+z2pt0ueyq600FZNbXB/6wgpzCllU=; b=dVoIWMNBJtWYgJoO84ePWFJqXGK6mxFNW7jBKwW74rrirXhkUS7z5g9CXOYen2OdYT dE7nVy2iJxWHdrcbhaxiEeei2K32OFWl5QqvaY/TcWq9JKdfGHzG0h8co+qit661S7V0 OifrMJwqvzb+Oh/nghlZJXEKgIPTJRj3ILilTTHJn+uQ2G0arFM4uGY+WyLhRytWn2Rk czl2djUEDBO15u7yF0th70nwk2Y3hndhpsc+Yta/17kEugMJkwW4u3oxTtCe/Affs8WQ x6ogdCRrhJZmAqXg21ZG5aXRgFsQG7k2rs9IshMwHqZZ6ikZpMRKJN/Dl2ymIxXjGFb/ 61pQ== X-Gm-Message-State: APjAAAXgDWxRkkN6eg7qY6J4Fdxu57Sl6ofj8IhTAtnVMyv43Jogaqyu 7qWWmLidWkIxg9X3wLrNwSeMgQ== X-Received: by 2002:a17:906:138e:: with SMTP id f14mr9414382ejc.279.1557250415860; Tue, 07 May 2019 10:33:35 -0700 (PDT) Received: from phenom.ffwll.local ([2a02:168:569e:0:3106:d637:d723:e855]) by smtp.gmail.com with ESMTPSA id p13sm2245429ejj.2.2019.05.07.10.33.34 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 07 May 2019 10:33:35 -0700 (PDT) From: Daniel Vetter To: Intel Graphics Development Cc: Daniel Vetter , Daniel Vetter , Peter Zijlstra , Ingo Molnar , Will Deacon , Petr Mladek , Sergey Senozhatsky , Steven Rostedt , John Ogness , linux-kernel@vger.kernel.org, Nicolai Stange , Thomas Gleixner Subject: [PATCH] RFC: x86/smp: use printk_deferred in native_smp_send_reschedule Date: Tue, 7 May 2019 19:33:29 +0200 Message-Id: <20190507173329.17031-1-daniel.vetter@ffwll.ch> X-Mailer: git-send-email 2.20.1 MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org console_trylock, called from within printk, can be called from pretty much anywhere. Including try_to_wake_up. Note that this isn't common, usually the box is in pretty bad shape at that point already. But it really doesn't help when then lockdep jumps in and spams the logs, potentially obscuring the real backtrace we're really interested in. One case I've seen (slightly simplified backtrace): Call Trace: console_trylock+0xe/0x60 vprintk_emit+0xf1/0x320 printk+0x4d/0x69 __warn_printk+0x46/0x90 native_smp_send_reschedule+0x2f/0x40 check_preempt_curr+0x81/0xa0 ttwu_do_wakeup+0x14/0x220 try_to_wake_up+0x218/0x5f0 pollwake+0x6f/0x90 credit_entropy_bits+0x204/0x310 add_interrupt_randomness+0x18f/0x210 handle_irq+0x67/0x160 do_IRQ+0x5e/0x130 common_interrupt+0xf/0xf This alone isn't a problem, but the spinlock in the semaphore is also still held while waking up waiters (up() -> __up() -> try_to_wake_up() callchain), which then closes the runqueue vs. semaphore.lock loop, and upsets lockdep, which issues a circular locking splat to dmesg. Worse it upsets developers, since we don't want to spam dmesg with clutter when the machine is dying already. This is fix attempt number 3, we've already tried to: - make the console_trylock trylock also the spinlock. This works in the limited case of the console_lock use-case, but doesn't fix the same semaphore.lock acquisition in the up() path in console_unlock, which we can't avoid with a trylock. - move the wake_up_process in up() out from under the semaphore.lock spinlock critical section. Again this works for the limited case of the console_lock, and does fully break the cycle for this lock. Unfortunately there's still plenty of scheduler related locks that wake_up_process needs, so the loop is still there, just with a few less locks involved. Hence now third attempt, trying to fix this by using printk_deferred() instead of the normal printk that WARN() uses. native_smp_send_reschedule is only called from scheduler related code, which has to use printk_deferred due to this locking recursion, so this seems consistent. It has the unfortunate downside that we're losing the backtrace though (I didn't find a printk_deferred version of WARN, and I'm not sure it's a bright idea to dump that much using printk_deferred.) Keeping all the people from the console_lock/printk related attempts on cc as fyi. Note: We can only hit this in our CI, with a very low reproduction rate. And right now the lockdep splat and a few other things crowd out what actually happens in the little bit of dmesg we recover, so no idea yet why exactly we're hitting that WARN(). Signed-off-by: Daniel Vetter Cc: Peter Zijlstra Cc: Ingo Molnar Cc: Will Deacon Cc: Petr Mladek Cc: Sergey Senozhatsky Cc: Steven Rostedt Cc: Daniel Vetter Cc: John Ogness Cc: linux-kernel@vger.kernel.org Cc: Nicolai Stange Cc: Thomas Gleixner --- arch/x86/kernel/smp.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c index 04adc8d60aed..f19782786669 100644 --- a/arch/x86/kernel/smp.c +++ b/arch/x86/kernel/smp.c @@ -125,7 +125,8 @@ static bool smp_no_nmi_ipi = false; static void native_smp_send_reschedule(int cpu) { if (unlikely(cpu_is_offline(cpu))) { - WARN(1, "sched: Unexpected reschedule of offline CPU#%d!\n", cpu); + printk_deferred(KERN_WARNING + "sched: Unexpected reschedule of offline CPU#%d!\n", cpu); return; } apic->send_IPI(cpu, RESCHEDULE_VECTOR); -- 2.20.1