Received: by 2002:a05:7412:b795:b0:e2:908c:2ebd with SMTP id iv21csp390777rdb; Thu, 2 Nov 2023 06:49:09 -0700 (PDT) X-Google-Smtp-Source: AGHT+IES29NSHt7l3xUrb+MpOiVlJtWVIEtTULArnGQFDlunVohERDF6WQGiOcQYufShINIt8Pcr X-Received: by 2002:a05:6a21:a591:b0:16b:bd0f:ad06 with SMTP id gd17-20020a056a21a59100b0016bbd0fad06mr24663599pzc.31.1698932949542; Thu, 02 Nov 2023 06:49:09 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1698932949; cv=none; d=google.com; s=arc-20160816; b=bbs496+fW6l+JPmTcHekSV7zltzS3+8qz4W48pS0rd/mi9/+qFHk9Wr5IFv74dwrEt piLMENiAUO3IJ79sfSBbS0sl7SFM8q1QKsA9poBbjizxLd2OJjrY6Adky/ytNdmdcSx0 FEX8li2wPyTKWlAcrLBsPeGZD5aLXIi8SA+hIi4k1N4MEsf+LkFZHG6k04hG4DdEJ1lT eIKnYIj7mw6N1t/pYz0jMhOOVaGgL0fdWajbzPMfjQoIQw7ynrwYie1/cAcrKT+0TUnY 8dA9a8TWDSWzL1dMPRO2m7DPWdDZR8lZ60wHNwRgwGiwlZYxwbnOGmsejE6NVxdeoNzl YxHw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:mime-version:message-id:date:in-reply-to:subject :cc:to:dkim-signature:dkim-signature:from; bh=exRsZeFA/Fox41PFwSzIwfdOdhyGCgCdI0norJsp0Y0=; fh=5WaRyK2ddkOKHNWfEzbbZu2HWObkrNOFcBnPB+dGIv8=; b=TgG7UEIN2ewLLG9J4fGJi4fZ9SfXkgpDSG2vdlQ8kGYbGvFHk6zUdJx9r1zaUJeq18 1KjPyg6vejddL9pWmjrKP1C3O5oiY0dtkVOYC7df3TMqndkM+hBvWPQ34aKY5w1fMcuE 8h8r9J0dz0Fx79RYLGGq021SX5ncioiwMdgy+s09PpoyAR1SiwQvXEJT9bG+ElBCA/2C MLaOu8N4om95/qTCwZAVOb2ZgGFfEOrv6EA70Blj8vC+YnmCflEmenZq/76MAtb03MbZ KFltw6aBn08XxpxW+kSjp8ZVtC+pyuUtah1NkZMolWtwZBsgTdwdPE6qwJTO5kubIJCd BN2w== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linutronix.de header.s=2020 header.b=PVLXSieQ; dkim=neutral (no key) header.i=@linutronix.de; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:2 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=linutronix.de Return-Path: Received: from agentk.vger.email (agentk.vger.email. [2620:137:e000::3:2]) by mx.google.com with ESMTPS id w1-20020a170902d10100b001cc0c4c1c63si4628306plw.165.2023.11.02.06.49.08 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 02 Nov 2023 06:49:09 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:2 as permitted sender) client-ip=2620:137:e000::3:2; Authentication-Results: mx.google.com; dkim=pass header.i=@linutronix.de header.s=2020 header.b=PVLXSieQ; dkim=neutral (no key) header.i=@linutronix.de; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:2 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=linutronix.de Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by agentk.vger.email (Postfix) with ESMTP id 1E31C80A32E9; Thu, 2 Nov 2023 06:49:04 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at agentk.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1376664AbjKBNsf (ORCPT + 99 others); Thu, 2 Nov 2023 09:48:35 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52104 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229493AbjKBNse (ORCPT ); Thu, 2 Nov 2023 09:48:34 -0400 Received: from galois.linutronix.de (Galois.linutronix.de [193.142.43.55]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B9F4B83 for ; Thu, 2 Nov 2023 06:48:27 -0700 (PDT) From: John Ogness DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1698932905; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to; bh=exRsZeFA/Fox41PFwSzIwfdOdhyGCgCdI0norJsp0Y0=; b=PVLXSieQA79uJ1mXex4K2zHQKFpBbDk/9aCIebp+kMHUU0WEoYv/sdEYkG3X7t1V9uj6V7 wWGH6OWqDztY7cttuBnVWw9Gzgg6/q6gosbWxbaMH/3U53yMdF+KdnYOksGxEdHki239JD Dd9YRaytfqWAQbVJhDDxnk4WkQIdzFgvUlpQjFq8IkyKoHpXMhCGIvRYp/IuOasLWOfMUc QHX9f+HIvXVFbpItPmxJ8J8e/5t4Zw5wzQQX0V1ruzK7/5s2TXjHkfzg6iAaqzGLv9OOfo KU4b2Us/Fx5ZcnRJb7Dejkg78HOzx9stpdmiTDDwUZX6awI4VpC6AmP4VZRbrw== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1698932905; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to; bh=exRsZeFA/Fox41PFwSzIwfdOdhyGCgCdI0norJsp0Y0=; b=4DE/cjrYhMXRnuEsvR1e55oGsYizjf6FdiemXvqqyZ9c/32uf8W60IL/RY6OF4hYDP5sLG 76/ZUn6BI1TgH7Bg== To: Petr Mladek Cc: Sergey Senozhatsky , Steven Rostedt , Thomas Gleixner , linux-kernel@vger.kernel.org, Mukesh Ojha , Chunlei Wang Subject: Re: [RFC PATCH printk v1] printk: ringbuffer: Do not skip non-finalized with prb_next_seq() In-Reply-To: Date: Thu, 02 Nov 2023 14:54:23 +0106 Message-ID: <87zfzwp8pk.fsf@jogness.linutronix.de> MIME-Version: 1.0 Content-Type: text/plain X-Spam-Status: No, score=-0.2 required=5.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INVALID_DATE_TZ_ABSURD, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE, URIBL_BLOCKED autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on agentk.vger.email Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (agentk.vger.email [0.0.0.0]); Thu, 02 Nov 2023 06:49:04 -0700 (PDT) On 2023-10-25, Petr Mladek wrote: > there seems to be missing word in the subject: > > s/non-finalized/non-finalized records/ Ack. > On Thu 2023-10-19 15:31:45, John Ogness wrote: >> Commit f244b4dc53e5 ("printk: ringbuffer: Improve prb_next_seq() >> performance") introduced an optimization for prb_next_seq() by >> using best-effort to track recently finalized records. However, >> the order of finalization does not necessarily match the order >> of the records. This can lead to prb_next_seq() returning >> higher than desired sequence numbers, which results in the >> reader skipping over records that are not yet finalized. From >> the reader's perspective it results in messages never being >> seen. > > IMHO, "messages never being seen" is too strong. Agreed. A reader does not use prb_next_seq() to decide what to print next. Worst case it thinks records are available that are not (available for that reader). > I have found only one (or two) scenarios where the messages might > really get lost. > > 1. It might happen when real console is replacing a boot console. > The real console is initialized with the value returned > by prb_next_seq(). And the boot console might not be able > to flush earlier non-finalized records. This cannot happen because in this situation console_init_seq() sets @seq to the lowest boot console counter. > 2. The other scenario is based on the fact that console_unlock() > or pr_flush() might see lower prb_next_seq() than the last > reserved record. It means that they might not flush all > pending records. > > But wait! This is actually the opposite case. pr_flush() > and console_unlock() might miss the messages when > they see too low prb_next_seq(). > > Important: This problem existed since introducing > the lockless ring buffer! > > The question is. Should pr_flush() and console_unlock() > wait until all registered messages get finalized? > > It would need to ignore only the last record when it > is not finalized because it might be a continuous line. Yes, this is the question to answer. With the lockless ringbuffer we allow multiple CPUs/contexts to write simultaneously into the buffer. This creates an ambiguity as some writers will finalize sooner. IMHO we need 2 different functions: 1. A function that reports the last contiguous finalized record for a reader. This is useful for syslog and kmsg_dump to know what is available for them to read. We can use @last_finalized_seq for this, optimizing it correctly this time. 2. A function that reports the last reserved sequence number of a writer. This is useful for pr_flush and console_unlock to know when they are finished. This function can begin with @last_finalized_seq, looking for the last finalized record (skipping over any non-finalized). > I agree that this might be optimized. I think about reducing the > number of cmpxchg even more, something like: > > static void desc_update_last_finalized(struct prb_desc_ring *desc_ring) > { > struct prb_desc_ring *desc_ring = &rb->desc_ring; > u64 prev_seq = desc_last_finalized_seq(desc_ring); > u64 seq = prev_seq; > > try_again: > while (_prb_read_valid(rb, &seq, NULL, NULL)) > seq++; > > if (seq == prev_seq) > return; > > oldval = __u64seq_to_ulseq(prev_seq); > newval = __u64seq_to_ulseq(seq); > > if (!atomic_long_try_cmpxchg_relaxed(&desc_ring->last_finalized_seq, > &oldval, newval)) { > prev_seq = seq; > goto try_again; > } > } I am fine with this implementation. > It looks to me that we could keep passing desc_ring as the parameter. No, _prb_read_valid() needs it. > I feel that we need a read barrier here. It should be between the > above > > atomic_long_read(&desc_ring->last_finalized_seq)) > > and the below > > while (_prb_read_valid(rb, &seq, NULL, NULL)) > seq++; > > It should make sure that the _prb_read_valid() will see all messages > finalized which were seen finalized by the CPU updating > desc_ring->last_finalized_seq. Generally we have not concerned ourselves with readers. But I agree we should make the optimization coherent with what a reader can actually read. It might save some CPU cycles for polling tasks. Writing and reading of @last_finalized_seq will provide the necessary boundaries to guarantee this: ...finalize record... atomic_long_try_cmpxchg_release(&desc_ring->last_finalized_seq, ...); and atomic_long_read_acquire(&desc_ring->last_finalized_seq); ...read record... This guarantees that if a reader sees a certain @last_finalized_seq value, that they will also see the record that was finalized. This will be the 13th memory barrier pair to be added to the documentation. I would like to submit a new patch implementing things as described here. John