Date: Thu, 4 Oct 2018 17:55:15 +0900
From: Sergey Senozhatsky
To: Petr Mladek, Steven Rostedt
Cc: Sergey Senozhatsky, Daniel Wang, rostedt@goodmis.org,
    stable@vger.kernel.org, Alexander.Levin@microsoft.com,
    akpm@linux-foundation.org, byungchul.park@lge.com,
    dave.hansen@intel.com, hannes@cmpxchg.org, jack@suse.cz,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org, Mathieu Desnoyers,
    Mel Gorman, mhocko@kernel.org, pavel@ucw.cz,
    penguin-kernel@i-love.sakura.ne.jp, peterz@infradead.org, tj@kernel.org,
    torvalds@linux-foundation.org, vbabka@suse.cz, Cong Wang, Peter Feiner
Subject: Re: 4.14 backport request for dbdda842fe96f: "printk: Add console owner and waiter logic to load balance console writes"
Message-ID: <20181004085515.GC12879@jagdpanzerIV>
In-Reply-To: <20181004083609.kcziz2ynwi2w7lcm@pathway.suse.cz>
References: <20181002084225.6z2b74qem3mywukx@pathway.suse.cz>
    <20181002212327.7aab0b79@vmware.local.home>
    <20181003091400.rgdjpjeaoinnrysx@pathway.suse.cz>
    <20181003133704.43a58cf5@gandalf.local.home>
    <20181004074442.GA12879@jagdpanzerIV>
    <20181004083609.kcziz2ynwi2w7lcm@pathway.suse.cz>

On (10/04/18 10:36), Petr Mladek wrote:
>
> This looks like a reasonable explanation of what is happening here.
> It also explains why the console owner logic helped.

Well, I'm still a bit puzzled, frankly speaking. I've two theories.

Theory #1 [most likely]

  Steven is a wizard and his code cures whatever problem we throw at it.

Theory #2

  console_sem hand over actually spreads the print out, so we don't have
  one CPU doing all the printing job. Instead every CPU prints its own
  backtrace, while the CPU which issued all_cpus_backtrace() waits for
  them. So all_cpus_backtrace() still has to wait for
  NR_CPUS * strlen(backtrace), which still probably triggers NMI panic on
  it at some point. The panic CPU sends out the stop IPI, then it waits
  for foreign CPUs to ACK the stop IPI request - for 10 seconds. So each
  CPU prints its backtrace, then ACKs the stop IPI. So when the panic CPU
  proceeds with flush_on_panic() and emergency_reboot(), uart_port->lock
  is unlocked.

  Without the patch we probably declare NMI panic on the CPU which does
  all the printing work, and panic sometimes jumps in when that CPU is
  busy in serial8250_console_write(), holding the uart_port->lock. So we
  can't re-enter the 8250 driver from the panic CPU and we can't reboot
  the system.

  In other words... Steven is a wizard.

> > serial8250_console_write()
> > {
> > 	if (port->sysrq)
> > 		locked = 0;
> > 	else if (oops_in_progress)
> > 		locked = spin_trylock_irqsave(&port->lock, flags);
> > 	else
> > 		spin_lock_irqsave(&port->lock, flags);
> >
> > 	...
> > 	uart_console_write(port, s, count, serial8250_console_putchar);
> > 	...
> >
> > 	if (locked)
> > 		spin_unlock_irqrestore(&port->lock, flags);
> > }
> >
> > Now... the problem. A theory, in fact.
> > panic() sets oops_in_progress back to zero - bust_spinlocks(0) - too soon.
>
> I see your point. I am just a bit scared of this way. Ignoring locks
> is a dangerous and painful approach in general.

Well, I agree. But 8250 is not the only console which sometimes ignores
the uart_port lock state. Otherwise sysrq would be totally unreliable,
including emergency reboot. So it's sort of how it has been for quite
some time, I guess. We are in panic(), it's over, so we can probably
ignore uart_port->lock at this point.

	-ss
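
P.S. To make the locking argument a bit more concrete, here is a tiny
userspace sketch of the three branches in the quoted
serial8250_console_write(). It is only a model under my own assumptions,
not kernel code: model_lock, model_trylock() and model_console_write()
are made-up names, the "lock" is a plain flag, and instead of really
spinning the model reports WOULD_DEADLOCK when a plain spin_lock would
wait on a lock whose holder has already been stopped.

```c
#include <stdbool.h>

/* Hypothetical stand-in for uart_port->lock: just a "held" flag. */
struct model_lock {
	bool held;
};

enum write_result {
	WROTE_LOCKED,    /* took the lock, wrote, released it */
	WROTE_UNLOCKED,  /* wrote without owning the lock */
	WOULD_DEADLOCK,  /* unconditional lock on a lock that is never released */
};

/* Models spin_trylock_irqsave(): succeeds only if the lock is free. */
static bool model_trylock(struct model_lock *l)
{
	if (l->held)
		return false;
	l->held = true;
	return true;
}

/*
 * Models the branch structure of the quoted function.
 * Assumption: if the lock is held, its holder is a stopped CPU,
 * so nobody will ever release it.
 */
static enum write_result model_console_write(struct model_lock *l,
					     bool sysrq,
					     bool oops_in_progress)
{
	bool locked = true;

	if (sysrq) {
		locked = false;                 /* ignore the lock */
	} else if (oops_in_progress) {
		locked = model_trylock(l);      /* best effort */
	} else {
		if (l->held)
			return WOULD_DEADLOCK;  /* spin_lock would spin forever */
		l->held = true;
	}

	/* ... the characters would be pushed to the UART here ... */

	if (locked) {
		l->held = false;
		return WROTE_LOCKED;
	}
	return WROTE_UNLOCKED;
}
```

With the lock held by a stopped CPU, calling model_console_write() with
oops_in_progress set still gets the message out (WROTE_UNLOCKED), while
the same call with oops_in_progress cleared is the WOULD_DEADLOCK case,
which is the situation I suspect we hit when bust_spinlocks(0) runs too
early.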