Date: Tue, 19 Jun 2018 09:53:08 +0900
From: Sergey Senozhatsky
To: Alan Cox
Cc: Sergey Senozhatsky, Petr Mladek, Steven Rostedt, Greg Kroah-Hartman, Jiri Slaby, Linus Torvalds, Peter Zijlstra, Andrew Morton, Dmitry Vyukov, linux-kernel@vger.kernel.org, linux-serial@vger.kernel.org, Sergey Senozhatsky
Subject: Re: [RFC][PATCH 0/6] Use printk_safe context for TTY and UART port locks
Message-ID: <20180619005308.GA405@jagdpanzerIV>
References: <20180615093919.559-1-sergey.senozhatsky@gmail.com> <20180618143818.50b2f2f9@alans-desktop>
In-Reply-To: <20180618143818.50b2f2f9@alans-desktop>

Thanks for taking a look!

On (06/18/18 14:38), Alan Cox wrote:
> > It doesn't come as a surprise that recursive printk() calls are not the
> > only way for us to deadlock in printk() and we still have a whole bunch
> > of other printk() deadlock scenarios. For instance, those that involve
> > TTY port->lock spin_lock and UART port->lock spin_lock.
>
> The tty layer code there is not re-entrant. Nor is it supposed to be

Could be. But at least we have circular locking dependency in tty,
see [1] for more details:

    tty_port->lock => uart_port->lock

    CPU0
    tty
     spin_lock(&tty_port->lock)
      printk()
       call_console_drivers()
        foo_console_write()
         spin_lock(&uart_port->lock)

Whereas we normally have

    uart_port->lock => tty_port->lock

    CPU1
    IRQ
     foo_console_handle_IRQ()
      spin_lock(&uart_port->lock)
       tty
        spin_lock(&tty_port->lock)

If we switch to printk_safe when we take tty_port->lock then we
remove the printk->uart_port chain from the picture.

> > So the idea of this patch set is to take tty_port->lock and
> > uart_port->lock from printk_safe context and to eliminate some
> > of non-recursive printk() deadlocks - the ones that don't start
> > in printk(), but involve console related locks and thus eventually
> > deadlock us in printk(). For this purpose the patch set introduces
> > several helper macros:
>
> I don't see how this helps - if you recurse into the uart code you are
> still hitting the paths that are unsafe when re-entered. All you've done
> is messed up a pile of locking code on critical performance paths.
>
> As it stands I think it's a bad idea.

The only new thing is that we inc/dec per-CPU printk context
variable when we lock/unlock tty/uart port lock:

    printk_safe_enter() -> this_cpu_inc(printk_context);
    printk_safe_exit()  -> this_cpu_dec(printk_context);

How does this help? Suppose we have the following

    IRQ
     foo_console_handle_IRQ()
      spin_lock(&uart_port->lock)
       uart_write_wakeup()
        tty_port_tty_wakeup()
         tty_port_default_wakeup()
          printk()
           call_console_drivers()
            foo_console_write()
             spin_lock(&uart_port->lock)    << deadlock

If we take uart_port lock from printk_safe context, we remove the
printk->call_console_drivers->foo_console_write->spin_lock chain,
because printk() output will end up in a per-CPU buffer, which will
be flushed later from irq_work.
So the whole thing becomes:

    IRQ
     foo_console_handle_IRQ()
      printk_safe_enter()
      spin_lock(&uart_port->lock)
       uart_write_wakeup()
        tty_port_tty_wakeup()
         tty_port_default_wakeup()
          printk()          << we don't re-enter foo_console_driver
                            << from printk() anymore
           printk_safe_log_store()
            irq_work_queue
      spin_unlock(&uart_port->lock)
      printk_safe_exit()
     iret

    #flush per-CPU buffer
    IRQ
     printk_safe_flush_buffer()
      vprintk_deferred()

> > Of course, TTY and UART port spin_locks are not the only locks that
> > we can deadlock on. So this patch set does not address all deadlock
> > scenarios, it just makes a small step forward.
> >
> > Any opinions?
>
> The cure is worse than the disease.

Because of this_cpu_inc(printk_context) / this_cpu_dec(printk_context)?
Maybe. That's why I put RFC :)

> The only case that's worth looking at is the direct polled console code
> paths. The moment you touch the other layers you add essentially never
> needed code to hot paths.
>
> Given printk nowdays is already somewhat unreliable with all the perf
> related changes, and we have other good debug tools I think it would be
> far cleaner to have some kind of
>
>
>     if (spin_trylock(...)) {
>         console_defer(buffer);
>         return;
>     }
>
> helper layer in the printk/console logic, at least for the non panic/oops
> cases.

spin_trylock() in every ->foo_console_write() callback?
This still will not address the reported deadlock [1].

[1] lkml.kernel.org/r/000000000000d557e7056e1c7a01@google.com

    -ss
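
To make the printk_safe argument above a bit more concrete, here is a rough
userspace model of it. This is only a sketch, not kernel code: there is a
single "CPU", a plain int stands in for the per-CPU printk_context counter,
a flag stands in for uart_port->lock, and the flush runs right after the
handler instead of from irq_work. All model_* names and the variables are
made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical single-CPU model; none of these names exist in the kernel. */
static int printk_context;   /* stands in for the per-CPU context counter  */
static char safe_buf[256];   /* stands in for the per-CPU printk_safe buffer */
static size_t safe_len;
static int port_locked;      /* models uart_port->lock being held          */
static int console_writes;   /* times the console "driver" actually ran    */
static int deadlocks;        /* times we would have spun on our own lock   */

/* Models foo_console_write(): it needs the port lock. */
static void model_console_write(const char *s)
{
	if (port_locked) {
		/* A real spin_lock() would never return here. */
		deadlocks++;
		return;
	}
	console_writes++;
	fputs(s, stdout);
}

static void model_printk_safe_enter(void)
{
	printk_context++;
}

static void model_printk_safe_exit(void)
{
	assert(printk_context > 0);
	printk_context--;
}

/* Models printk(): in printk_safe context only store the message. */
static void model_printk(const char *s)
{
	if (printk_context > 0) {
		size_t n = strlen(s);

		if (safe_len + n < sizeof(safe_buf)) {
			memcpy(safe_buf + safe_len, s, n);
			safe_len += n;
			safe_buf[safe_len] = '\0';
		}
		return;
	}
	model_console_write(s);
}

/* Models the deferred flush that the kernel would run from irq_work. */
static void model_flush(void)
{
	if (safe_len) {
		model_console_write(safe_buf);
		safe_len = 0;
		safe_buf[0] = '\0';
	}
}

/* Models the IRQ handler that calls printk() with the port lock held. */
static void model_irq(int use_printk_safe)
{
	if (use_printk_safe)
		model_printk_safe_enter();
	port_locked = 1;
	model_printk("tty wakeup warning\n");
	port_locked = 0;
	if (use_printk_safe)
		model_printk_safe_exit();
	model_flush();
}
```

Calling model_irq(0) takes the direct path and bumps the deadlocks counter,
because the "driver" is entered with the port lock already held. Calling
model_irq(1) stores the message first and writes it only after the lock has
been dropped, which is the ordering shown in the second call chain above.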