Received: by 2002:a25:8b12:0:0:0:0:0 with SMTP id i18csp5522965ybl; Tue, 27 Aug 2019 06:05:28 -0700 (PDT) X-Google-Smtp-Source: APXvYqwqSa1rHFelI2Ydpnf0osKq+p7MltNxL2cFCQFdW19wXWZmf4REtejDUm9EXYHW3/Qu5e+R X-Received: by 2002:a17:902:e493:: with SMTP id cj19mr23926354plb.292.1566911128070; Tue, 27 Aug 2019 06:05:28 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1566911128; cv=none; d=google.com; s=arc-20160816; b=Zfx8xM/f3I09p7+D7u1UfYFOabf7VM/foFdOvBtmY3vithqQWpzuo+Go1lNx4qM+kp LnbktQCsYnqI+cnBiY9UMKvJ0zzx4zaaxYarsdhSJdsfnRaw+3ymCLCOnzJDH7hqAkwY FH5Fx0C74sCKxYoaoW6KkGxXW6NGfgejDl2QPzUu22Ikh1XuRXghE7ZNeE4iHr0ABdAj ym9fdfhM+7ePGJOdNEOo0X5cgtNhiJhXENWnWeckfG2ZKgymGxZ6NCSA5xBGMmp/3eDs i0kD2UZJ3c72cHjYELJSWbmcRLJBze4n7Ekx3oaM4nLleuihCrs5ii6kcCCb8K/bXqwd L2ew== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date; bh=Jd9c8Jbe3fead9kvHT/TYM/2mw0qrKZwe0ZhhykleBM=; b=jF22+1IRBe8BkwenFFB8ZEHLauSA4JB2EMSEXE+vtgDLzYHOLMxmJTp1YohRh2JA7o AvEbSIBdrbTM/i0vabe5rDAnppERwWgwQuVPkT390jR1au/gj/A35qD4n5ypoMuuAf1C nI7hwJr6uEQ43IvlpwDNwJgRJ4V+vaLeSpDO0+9K943f8c9nWECDqy2cMyISLlXEVRQt auUEcNs/MIrNI4D4Fpi+acOOvtsYJ1b6YymIXIgEWz08uzFJBx8uvMPf2M3Wxyw9HC/J g6Cc5MhNvQUP5yfTnfsuya1hAy3uPk4gHx1yQSKc81mgmU7Ygkxo1C0OfP7hmHF0OoDX +aRw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id r26si12068893pge.408.2019.08.27.06.05.12; Tue, 27 Aug 2019 06:05:28 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727270AbfH0NDz (ORCPT + 99 others); Tue, 27 Aug 2019 09:03:55 -0400 Received: from mx2.suse.de ([195.135.220.15]:42108 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1726054AbfH0NDz (ORCPT ); Tue, 27 Aug 2019 09:03:55 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id B883EB016; Tue, 27 Aug 2019 13:03:51 +0000 (UTC) Date: Tue, 27 Aug 2019 15:03:49 +0200 From: Petr Mladek To: John Ogness Cc: Andrea Parri , Sergey Senozhatsky , Sergey Senozhatsky , Steven Rostedt , Brendan Higgins , Peter Zijlstra , Thomas Gleixner , Linus Torvalds , Greg Kroah-Hartman , linux-kernel@vger.kernel.org Subject: Re: numlist API Re: [RFC PATCH v4 1/9] printk-rb: add a new printk ringbuffer implementation Message-ID: <20190827130349.6mrnhdlqyqokgsfk@pathway.suse.cz> References: <20190807222634.1723-1-john.ogness@linutronix.de> <20190807222634.1723-2-john.ogness@linutronix.de> <20190823171802.eo2chwyktibeub7a@pathway.suse.cz> <20190823171802.eo2chwyktibeub7a@pathway.suse.cz> <87sgpnmqdo.fsf@linutronix.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <87sgpnmqdo.fsf@linutronix.de> User-Agent: NeoMutt/20170912 (1.9.0) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue 2019-08-27 01:57:39, John Ogness wrote: > On 2019-08-23, Petr Mladek wrote: > >> --- /dev/null > >> +++ b/kernel/printk/numlist.c > >> @@ -0,0 +1,375 @@ > >> +// SPDX-License-Identifier: GPL-2.0 > >> + > >> +#include > >> +#include "numlist.h" > > > > struct numlist is really special variant of a list. Let me to > > do a short summary: > > > > + FIFO queue interface > > > > + nodes sequentially numbered > > > > + nodes referenced by ID instead pointers to avoid ABA problems > > + requires custom node() callback to get pointer for given ID > > > > + lockless access: > > + pushed nodes must not longer get modified by push() caller > > + pop() caller gets exclusive write access, except that they > > must modify ID first and do smp_wmb() later > > Only if the "numlist user" decides to recycle descriptors (which the > printk_ringbuffer does) is ID modification of descriptors necessary. How > that is synchronized with readers is up to the user (for example, > whether a RELEASE or an smp_wmb() is used). IMHO, the most tricky part of numlist API is handling of IDs. The IDs are there to avoid ABA races when reusing the nodes. I want to say that this API is useful only when the nodes are reused. All other users would want anything easier. > > + pop() does not work: > > + tail node is "busy" > > + needs a custom callback that defines when a node is busy > > Note that busy() could always return false if the user has no concept of > nodes that should not be popped. This would be append only list. Again the is no need for IDs and other complexities. > > + tail is the last node > > + needed for lockless sequential numbering > > > > I will start with one inevitable question ;-) Is it realistic to find > > another user for this API, please? > > If someone needs a FIFO queue that supports: > > 1. multiple concurrent writers and multiple concurrent non-consuming > readers > > 2. where readers are allowed to miss nodes but are able to detect how > many were missed > > 3. from any context (including NMI) > > then I know of no other data structure available. (Otherwise I would > have used it!) It might be also because nobody else needed structure with our numlist semantic. I guess that lockless read/write structures are usually implemented using RCU. > > I am not sure that all the indirections, caused by the generic API, > > are worth the gain. > > IMHO the API is sane. The only bizarre rule is that the numlist must > always have at least 1 node. But since the readers are non-consuming, > there is no real tragedy here. > > My goal is not to create some fabulous abstract data structure that > everyone should use. But I did try to minimize numlist (and dataring) to > only be concerned with clearly defined and minimal responsibilities > without imposing unnecessary restrictions on the user. The API is complicated because of the callbacks. It depends on a logic that is implemented externally. It makes it abstract to some extent. My view is that the API would be much cleaner and easier to review when the ID handling is "hardcoded" (helper functions). It could be made abstract anytime later when there is another user. There should always be a reason why to make a code more complicated than necessary. It seems that the only reason is some theoretical future user and its theoretical requirements. > > Well, the separate API makes sense anyway. I have some ideas that > > might make it cleaner. > > [snipped the nice refactoring of the ID into the nl_node] > > Your idea (along with previous discussions) convinced me of the > importance of moving the ID-related barriers into the same > file. However, rather than pushing the ID parts into the numlist, I will > be moving them all into the "numlist user" > (i.e. printk_ringbuffer). Your use of the ACQUIRE to load the ID made me > realize that I need to be doing that as well! (but in the node() > callback) > > The reasons why I do not want the ID in nl_node is: > > - The numlist would need to implement the ID-to-node mapping. For the > printk_ringbuffer that mapping is simply masking to an index within an > array. But why should a numlist user be forced to do it that way? I > see no advantage to restricting numlists to being arrays of nodes. It might be done a generic way when there is a user with another need. Honestly, I have big troubles to imagine another reasonable mapping between id and pointer than masking. We are talking about lockless code. Anything more complicated might become a nightmare. > - The dataring structure also uses IDs and requires an ID-to-node > mapping. I do not want to bind the dataring and numlist data > structures together at this level because they really have nothing to > do with each other. Having the dataring and numlist ID-to-node > mappings (and their barriers) in the same place (in the > numlist/dataring _user_) simplifies the big picture. The ID is used in all three APIs. Then it might be only a matter of taste where it is stored. I still feel that struct nl_node is better place because: + already includes next_id + includes seq that identifies the structure another way + id describes the node I still need to think more about the other APIs. Well, id substitutes a pointer here. It is like struct list_head pointer. It is normal that it is passed as parameter by a list API user. > I believe your main concern was having easily visible symmetric > barriers. We can achieve that if the read-barriers are in the callbacks > (for both numlist and dataring). I think it makes more sense to put them > there. dataring and numlist should not care about the ID-to-node > mapping. Symmetry is really important. It is often sign of a good design. Simple and straightforward code is another important thing at this stage. The code is complicated and we need to make sure that it works. Any optimizations and generalization might be done later when needed. Best Regards, Petr