Received: by 2002:ab2:6203:0:b0:1f5:f2ab:c469 with SMTP id o3csp267509lqt; Thu, 18 Apr 2024 14:45:13 -0700 (PDT) X-Forwarded-Encrypted: i=3; AJvYcCVmIAwX3BBI7LwxvyXfjOEzDY8jvEqimEIpLcc5Lg5VyvpO0zaFb1PKqtYnm/QDVe0SgXVSrYF0pHDjVEJjbGF3NsonykrC/YiUmn1y3Q== X-Google-Smtp-Source: AGHT+IFogiY9yN6PUxIBztuwTcYe2YrDh1G6XDvZ1MDumoOptlo0SOrSzxKbBaNJSIPQqaEv33mj X-Received: by 2002:a17:906:2758:b0:a4e:946a:3b0a with SMTP id a24-20020a170906275800b00a4e946a3b0amr290490ejd.34.1713476713462; Thu, 18 Apr 2024 14:45:13 -0700 (PDT) ARC-Seal: i=2; a=rsa-sha256; t=1713476713; cv=pass; d=google.com; s=arc-20160816; b=Vik94r6OBvHILwAqcgmNT0ehrQcrkKeV6gCt48mAwyYZpQH+7KabT2iX3aI6ibRkgZ v05daL9tC4bGwENzIH1vQyj0XIEaH5BHpgGCIjGP7TcykA0uxH2GidXtv7VKg5v2L4oD nlwOBKu40Wp/Kr62O/IbwQBXVtNq2Ak/2WoU8aQ95QsT/mPDpZ9lR94w+zYVEIRIXfhw jxwqJ8J7B6TFCoOQ0BboUhx3+ne/jdMywAmTfkbmXXnfwM9DuNlZoIuPvT8/TJB8qX2W UzvrDkWK8Qz3bPFzVPXNW7yLw0X954oRkK7YUjWqmg7mXsbxyfyrocq9uN0Vwk9HDA/9 uDzA== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=mime-version:list-unsubscribe:list-subscribe:list-id:precedence :message-id:date:references:in-reply-to:subject:cc:to:dkim-signature :dkim-signature:from; bh=sr+UNQV62Ol6VrGa4nW/aaDoGmYQye3tnIA9PG9k83w=; fh=pKnPKUsir0uEGSsor+4Zc2vgbu+g+ayvUgsdzkuXaoA=; b=OsKxKof7zUJR0hc9SurO2F+/y+QvXGyJz/X49bhaFIxziTqr50USVFxFhBh2Gwl8Ot XT3uEd6CRnSfkQlItMDgV89ea9wE9OC6ET9Kt6ZbHxXuuhJtIcyyRt0hku4qMhtyt9sn K03uz61xlQo+ZJn+GZFmZaURizSvPq2yHwO/1JEQQtwMpcXWpWDLsL6jMN5c4ggwVDkg wx1li9yYFHNXwAjkruE0/AsQy841tZ8PoV2WLlQTUqi7WgmE4kZcRbqXlYkAJNMFSuXS Qec02HzDWuw+bvXvxZ7jsmrJYXvHf8Ikagyia1wKgqr1FPMs5+xmyhqOmUKYCZ+K4uBd ZU0A==; dara=google.com ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@linutronix.de header.s=2020 header.b=nAYQubXT; dkim=neutral (no key) header.i=@linutronix.de header.s=2020e; arc=pass (i=1 spf=pass spfdomain=linutronix.de dkim=pass dkdomain=linutronix.de dmarc=pass fromdomain=linutronix.de); spf=pass (google.com: domain of linux-kernel+bounces-150812-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:4601:e00::3 as permitted sender) smtp.mailfrom="linux-kernel+bounces-150812-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=linutronix.de Return-Path: Received: from am.mirrors.kernel.org (am.mirrors.kernel.org. [2604:1380:4601:e00::3]) by mx.google.com with ESMTPS id bu18-20020a170906a15200b00a4422dcbc2bsi1262650ejb.774.2024.04.18.14.45.13 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 18 Apr 2024 14:45:13 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel+bounces-150812-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:4601:e00::3 as permitted sender) client-ip=2604:1380:4601:e00::3; Authentication-Results: mx.google.com; dkim=pass header.i=@linutronix.de header.s=2020 header.b=nAYQubXT; dkim=neutral (no key) header.i=@linutronix.de header.s=2020e; arc=pass (i=1 spf=pass spfdomain=linutronix.de dkim=pass dkdomain=linutronix.de dmarc=pass fromdomain=linutronix.de); spf=pass (google.com: domain of linux-kernel+bounces-150812-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:4601:e00::3 as permitted sender) smtp.mailfrom="linux-kernel+bounces-150812-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=linutronix.de Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by am.mirrors.kernel.org (Postfix) with ESMTPS id 0E5C71F21FD5 for ; Thu, 18 Apr 2024 21:45:13 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id B1D4A199E87; Thu, 18 Apr 2024 21:45:07 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=linutronix.de header.i=@linutronix.de header.b="nAYQubXT"; dkim=permerror (0-bit key) header.d=linutronix.de header.i=@linutronix.de header.b="YzSj/zk3" Received: from galois.linutronix.de (Galois.linutronix.de [193.142.43.55]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 16E4B16ABFF for ; Thu, 18 Apr 2024 21:45:04 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=193.142.43.55 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1713476706; cv=none; b=A10XJsws10OhhKGXDojYpMzCzp1GyAhUCX4tsVpm+53mVM8qW5vM0EySA3erPCXVWGUPk7+MXhioTNzA6bZC3WRnKnei9ID1GP7Mdny1neKm/7d+GxfJzTSLayEfaBF9tYhv8jTAjNPnxYyf+J4PdY7sAgDItKSsotCJSM7vu7U= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1713476706; c=relaxed/simple; bh=1bm/wORtkQCdpXihuwt8gUE+Ki2iVd0zJjtYxfzl9SI=; h=From:To:Cc:Subject:In-Reply-To:References:Date:Message-ID: MIME-Version:Content-Type; b=tdSG/zz6d/5mYVRDdLLYUWmAbDO7ERqAn7gd1r9rvTf09AQ0IXstAA61HKoWQzoP7DOWWZvmnReMf8b2igMyfNORMzmuxnnzAlCSX4J3EYZ/RmP4NNQTq0D7zrIN0jxjIVRPGuzeLP2hhJqDRfA+p/hdcCcll0xQwtzqBALozbs= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linutronix.de; spf=pass smtp.mailfrom=linutronix.de; dkim=pass (2048-bit key) header.d=linutronix.de header.i=@linutronix.de header.b=nAYQubXT; dkim=permerror (0-bit key) header.d=linutronix.de header.i=@linutronix.de header.b=YzSj/zk3; arc=none smtp.client-ip=193.142.43.55 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linutronix.de Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linutronix.de From: John Ogness DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1713476703; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=sr+UNQV62Ol6VrGa4nW/aaDoGmYQye3tnIA9PG9k83w=; b=nAYQubXTPnfoZjcGza6PlxJt1Q8CfxUqqrdXEtpFRGwOYWHqAtSgyI04m51ZLyTTSBMBj2 NQAr7/gXaQDfX5f80sCiy+mNvI+onK6WnJENJdxBry3Xtr4a/I3a+seBfjG0kiIVi7vkay d+T/6Etlg5PbHz6zPrXpfXKaE+qjbqrUlsGg1ICHioogdxJLMsBICbMskHXVkFDkieEFni kbEdYsRlF18H1MrHAmkfnFvlQykTvZOZlBWS6sCCZ2sNblipaospo1L8AlgRzH3uqiAP9Q nNfPHyLrWWZX0k9WWakYw/DGeOtyWEgKW9bThp+Pl5G/ozG5kgvWb/MpWIRtCA== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1713476703; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=sr+UNQV62Ol6VrGa4nW/aaDoGmYQye3tnIA9PG9k83w=; b=YzSj/zk3vPAxj8zDTwM8SgAO2maHMQt0Ai8RP/bVJDxkoRJi3lG5waLiRBA3WdYLv7W8ql hgkOPy8zDsfLk7Cw== To: Petr Mladek Cc: Sergey Senozhatsky , Steven Rostedt , Thomas Gleixner , linux-kernel@vger.kernel.org Subject: Re: [PATCH printk v4 17/27] printk: nbcon: Use nbcon consoles in console_flush_all() In-Reply-To: References: <20240402221129.2613843-1-john.ogness@linutronix.de> <20240402221129.2613843-18-john.ogness@linutronix.de> <87plunoai0.fsf@jogness.linutronix.de> Date: Thu, 18 Apr 2024 23:51:01 +0206 Message-ID: <87v84e4a76.fsf@jogness.linutronix.de> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain On 2024-04-18, Petr Mladek wrote: >> > Solve this problem by introducing[*] nbcon_atomic_flush_all() >> > which would flush even newly added messages and >> > call this in nbcon_cpu_emergency_exit() when the printk >> > kthread does not work. It should bail out when there >> > is a panic in progress. >> > >> > Motivation: It does not matter which "atomic" context >> > flushes NORMAL/EMERGENCY messages when >> > the printk kthread is not available. >> >> I do not think that solves the problem. If the console is in an unsafe >> section, nothing can be printed. > > IMHO, it solves the problem. > > The idea is simple: > > "The current nbcon console owner will be responsible for flushing > all messages when the printk kthread does not exist." Currently this is not valid. It assumes owners are printers. That is not always true. The owner might be some task modifying the baud rate and has nothing to do with printing. > The prove is more complicated: > > 1. Let's put aside panic. We already do the best effort there. > > 2. Emergency mode currently violates the rule because > nbcon_atomic_flush_pending() ignores the simple rule. > > => FIX: improve nbcon_cpu_emergency_exit() to flush > all messages when kthreads are not ready. Emergency mode cannot flush _anything_ if there is an owner in an unsafe region. (And that owner may not be a printer.) > 3. Normal mode flushes nbcon consoles via > nbcon_legacy_emit_next_record() from console_unlock() > before the kthreads are started. > > It is not reliable when nbcon_try_acquire() fails. > But it would fail only when there is another user. > The other owner might be: > > + panic: will handle everything > > + emergency: should flush everything [*] > > + normal: can't happen because of con->device() lock. As the code is now, "normal" does not imply con->device() lock. When using con->write_atomic(), we do not (and can not) use the con->device() lock. So normal _can_ fail in nbcon_legacy_emit_next_record() if another CPU is adjusting the baud rate. This is why I said the problem with "emergency" is basically the same problem as "normal" (WRT using write_atomic()). > => The only remaining problem is to fix nbcon_atomic_flush_pending() > to flush everything when the kthreads are not started yet. I think my proposed change handles it better. I have been doing various tests and also adjusted it a bit. The reason the flushing fails is because another context owns the console. So I think it makes sense for that owner to handle flushing responsibility when releasing ownership (even if that context was just changing the baud rate). [ Keep in mind we are only talking about printing via write_atomic() when the kthread is not available. ] If the current owner is a normal printing context, it will print to completion anyway (via console_flush_all()). If the current owner is an emergency printing context, it will only print the emergency messages (as PRIO_EMERGENCY). However, when it releases ownership, it could flush the remaining records (as PRIO_NORMAL) in the same fashion as console_flush_all() does it. If the current owner is a non-printing context, when it releases ownership, it could flush the remaining records (as PRIO_NORMAL) in the same fashion as console_flush_all() does it. So what I am proposing is that we add two new normal-flushing sites that are only used when the kthread is not available: 1. after exiting emergency mode: in nbcon_cpu_emergency_exit() 2. after releasing ownership for non-printing: in nbcon_driver_release() I think this will close the gap and it does not require irq_work. > Sigh, all this is so complicated. I wonder how to document > this so that other people do not have to discover these > dependencies again and again. Is it even possible? In the end we will have the following 5 scenarios (assuming my proposal): 1. PRIO_NORMAL non-printing activity, always under con->device() lock, upon release flushes[*] full backlog 2. PRIO_NORMAL printing using write_thread(), always called from task context and under con->device() lock, always flushes full backlog 3. PRIO_NORMAL printing using write_atomic(), called with hardware interrupts disabled, always flushes full backlog, (only used when the kthread is not available) 4. PRIO_EMERGENCY printing using write_atomic(), called with hardware interrupts disabled, flushes through emergency, upon exit flushes[*] full backlog 5. PRIO_PANIC printing using write_atomic(), called with hardware interrupts disabled, flushes full backlog [*] Only when the kthread is not available. And in that case #3 is the scenario used for the trailing full flushing by #1 and #4. Full flushing is attempted in all 5 scenarios. A failed attempt means there is a new owner, but that owner is also one of the 5 scenarios. Am I missing something? IMHO #3 is the only bizarre scenario. It exists only to cover when the kthread is not available. John