From: Daniel Vetter
Date: Mon, 6 May 2019 10:40:32 +0200
Subject: Re: [PATCH] RFC: console: hack up console_trylock more
To: Petr Mladek
Cc: DRI Development, Intel Graphics Development, Daniel Vetter, Peter Zijlstra, Ingo Molnar, Will Deacon, Sergey Senozhatsky, Steven Rostedt, John Ogness, Linux Kernel Mailing List
References: <20190502141643.21080-1-daniel.vetter@ffwll.ch> <20190503151437.dc2ty2mnddabrz4r@pathway.suse.cz> <20190506074809.huawsdaynyci5kwz@pathway.suse.cz>
In-Reply-To: <20190506074809.huawsdaynyci5kwz@pathway.suse.cz>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, May 6, 2019 at 9:48 AM Petr Mladek wrote:
> On Mon 2019-05-06 09:11:37, Daniel Vetter wrote:
> > On Fri, May 3, 2019 at 5:14 PM Petr Mladek wrote:
> > > On Thu 2019-05-02 16:16:43, Daniel Vetter wrote:
> > > > console_trylock, called from within printk, can be called from pretty
> > > > much anywhere. Including try_to_wake_up. Note that this isn't common,
> > > > usually the box is in pretty bad shape at that point already. But it
> > > > really doesn't help when then lockdep jumps in and spams the logs,
> > > > potentially obscuring the real backtrace we're really interested in.
> > > >
> > > > One case I've seen (slightly simplified backtrace):
> > > >
> > > > Call Trace:
> > > >
> > > > console_trylock+0xe/0x60
> > > > vprintk_emit+0xf1/0x320
> > > > printk+0x4d/0x69
> > > > __warn_printk+0x46/0x90
> > > > native_smp_send_reschedule+0x2f/0x40
> > > > check_preempt_curr+0x81/0xa0
> > > > ttwu_do_wakeup+0x14/0x220
> > > > try_to_wake_up+0x218/0x5f0
> > > > pollwake+0x6f/0x90
> > > > credit_entropy_bits+0x204/0x310
> > > > add_interrupt_randomness+0x18f/0x210
> > > > handle_irq+0x67/0x160
> > > > do_IRQ+0x5e/0x130
> > > > common_interrupt+0xf/0xf
> > > >
> > > >
> > > > This alone isn't a problem, but the spinlock in the semaphore is also
> > > > still held while waking up waiters (up() -> __up() -> try_to_wake_up()
> > > > callchain), which then closes the runqueue vs. semaphore.lock loop,
> > > > and upsets lockdep, which issues a circular locking splat to dmesg.
> > > > Worse it upsets developers, since we don't want to spam dmesg with
> > > > clutter when the machine is dying already.
> > > >
> > > > Fix this by creating a __down_trylock which only trylocks the
> > > > semaphore.lock. This isn't correct in full generality, but good enough
> > > > for console_lock:
> > > >
> > > > - there's only ever one console_lock holder, we won't fail spuriously
> > > >   because someone is doing a down() or up() while there's still room
> > > >   (unlike other semaphores with count > 1).
> > > >
> > > > - console_unlock() has one massive retry loop, which will catch anyone
> > > >   who races the trylock against the up(). This makes sure that no
> > > >   printk lines will get lost. Making the trylock more racy therefore
> > > >   has no further impact.
> > >
> > > To be honest, I do not see how this could solve the problem.
> > >
> > > The circular dependency is still there. If the new __down_trylock()
> > > succeeds then console_unlock() will get called in the same context
> > > and it will still need to call up() -> try_to_wake_up().
> > >
> > > Note that there are many other console_lock() callers that might
> > > happen in parallel and might appear in the wait queue.
> >
> > Hm right. It's very rare we hit this in our CI and I don't know how to
> > repro otherwise, so just threw this out at the wall to see if it
> > sticks. I'll try and come up with a new trick then.
>
> Single messages are printed from scheduler via printk_deferred().
> WARN() might be solved by introducing printk deferred context,
> see the per-cpu variable printk_context.

I convinced myself that I can take the wake_up_process out from under
the spinlock, for the limited case of the console lock. I think that's
a cleaner and more robust fix than leaking printk_context trickery into
the console_unlock code.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
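
To illustrate the idea in the last paragraph of the mail (picking the
waiter while holding the semaphore's spinlock, but doing the wakeup only
after the lock is dropped), here is a minimal userspace sketch. It is not
the kernel's semaphore code and not the patch that came out of this
thread: struct model_sem, struct model_waiter, model_up() and model_wake()
are all hypothetical names, the spinlock is modelled as a plain flag, and
the sketch ignores waiter lifetime and concurrency, which a real
implementation would have to handle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model types; none of these are kernel APIs. */
struct model_waiter {
	struct model_waiter *next;
	bool woken;
};

struct model_sem {
	bool lock_held;               /* stands in for semaphore.lock */
	unsigned int count;           /* free slots in the semaphore */
	struct model_waiter *waiters; /* singly linked wait list */
};

/* Stands in for the wakeup; it must never run with the lock held. */
static void model_wake(const struct model_sem *sem, struct model_waiter *w)
{
	assert(!sem->lock_held);
	w->woken = true;
}

/* Release the semaphore: dequeue under the lock, wake after unlocking. */
static void model_up(struct model_sem *sem)
{
	struct model_waiter *w;

	sem->lock_held = true;          /* "spin_lock(&sem->lock)" */
	w = sem->waiters;
	if (w)
		sem->waiters = w->next; /* hand the slot to the first waiter */
	else
		sem->count++;           /* nobody waiting, free the slot */
	sem->lock_held = false;         /* "spin_unlock(&sem->lock)" */

	if (w)
		model_wake(sem, w);     /* wakeup happens outside the lock */
}
```

In this model the wakeup never nests inside the semaphore lock, so a
printk issued from the wakeup path would not find that lock already held.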