Date: Tue, 27 Aug 2019 16:36:35 +0200
From: Petr Mladek
To: John Ogness
Cc: Andrea Parri, Sergey Senozhatsky, Steven Rostedt, Brendan Higgins, Peter Zijlstra, Thomas Gleixner, Linus Torvalds, Greg Kroah-Hartman, linux-kernel@vger.kernel.org
Subject: Re: dataring_push() barriers Re: [RFC PATCH v4 1/9] printk-rb: add a new printk ringbuffer implementation
Message-ID: <20190827143635.4taqjj6wjz7gdlea@pathway.suse.cz>
References: <20190807222634.1723-1-john.ogness@linutronix.de> <20190807222634.1723-2-john.ogness@linutronix.de> <20190820135004.7vatbrzphfsgsnw2@pathway.suse.cz> <87r25aklsy.fsf@linutronix.de>
In-Reply-To: <87r25aklsy.fsf@linutronix.de>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sun 2019-08-25 04:42:37, John Ogness wrote:
> On 2019-08-20, Petr Mladek wrote:
> >> +/**
> >> + * dataring_push() - Reserve a data block in the data array.
> >> + *
> >> + * @dr: The data ringbuffer to reserve data in.
> >> + *
> >> + * @size: The size to reserve.
> >> + *
> >> + * @desc: A pointer to a descriptor to store the data block information.
> >> + *
> >> + * @id: The ID of the descriptor to be associated.
> >> + *      The data block will not be set with @id, but rather initialized with
> >> + *      a value that is explicitly different than @id. This is to handle the
> >> + *      case when newly available garbage by chance matches the descriptor
> >> + *      ID.
> >> + *
> >> + * This function expects to move the head pointer forward. If this would
> >> + * result in overtaking the data array index of the tail, the tail data block
> >> + * will be invalidated.
> >> + *
> >> + * Return: A pointer to the reserved writer data, otherwise NULL.
> >> + *
> >> + * This will only fail if it was not possible to invalidate the tail data
> >> + * block.
> >> + */
> >> +char *dataring_push(struct dataring *dr, unsigned int size,
> >> +		    struct dr_desc *desc, unsigned long id)
> >> +{
> >> +	unsigned long begin_lpos;
> >> +	unsigned long next_lpos;
> >> +	struct dr_datablock *db;
> >> +	bool ret;
> >> +
> >> +	to_db_size(&size);
> >> +
> >> +	do {
> >> +		/* fA: */
> >> +		ret = get_new_lpos(dr, size, &begin_lpos, &next_lpos);
> >> +
> >> +		/*
> >> +		 * fB:
> >> +		 *
> >> +		 * The data ringbuffer tail may have been pushed (by this or
> >> +		 * any other task). The updated @tail_lpos must be visible to
> >> +		 * all observers before changes to @begin_lpos, @next_lpos, or
> >> +		 * @head_lpos by this task are visible in order to allow other
> >> +		 * tasks to recognize the invalidation of the data
> >> +		 * blocks.
> >
> > This sounds strange. The write barrier should be done only on CPU
> > that really modified tail_lpos. I.e. it should be in _dataring_pop()
> > after successful dr->tail_lpos modification.
>
> The problem is that there are no data dependencies between the different
> variables. When a new datablock is being reserved, it is critical that
> all other observers see that the tail_lpos moved forward _before_ any
> other changes. _dataring_pop() uses an smp_rmb() to synchronize for
> tail_lpos movement.

It should be symmetric. It makes sense that _dataring_pop() uses
an smp_rmb(). Then there should be a wmb() in dataring_push().

The wmb() should be done only by the CPU that actually did the write.
And it should be done after the write. This is why I suggested
doing it after cmpxchg(dr->head_lpos).

> This CPU is about to make some changes and may have
> seen an updated tail_lpos. An smp_wmb() is useless if this is not the
> CPU that performed that update. The full memory barrier ensures that all
> other observers will see what this CPU sees before any of its future
> changes are seen.

I do not understand this. A full memory barrier does not make
all CPUs see the same values.

My understanding of barriers is:

  + wmb() is needed after some value is modified when any
    following modifications must become visible only later.

  + rmb() is needed when a value has to be read before
    the other values are read.

  These barriers need to be symmetric. The reader will see
  the values in the right order only when both the writer
  and the reader use the right barriers.

  + mb() full barrier is needed around some critical section
    to make sure that all operations happened inside the section.

Back to our situation:

  + rmb() should not be needed here because get_new_lpos() provided
    a valid lpos. It is possible that get_new_lpos() used rmb()
    to make sure that there was enough space. But such an rmb()
    would be between reading dr->tail_lpos and dr->head_lpos.
    No other rmb() is needed once the check has passed.

  + wmb() is not needed because we have not written anything yet.

If there was a race with another CPU, then cmpxchg(dr->head_lpos)
would fail and we would need to repeat everything again.

> >> +		/* fE: */
> >> +	} while (atomic_long_cmpxchg_relaxed(&dr->head_lpos, begin_lpos,
> >> +					      next_lpos) != begin_lpos);
> >> +
> >
> > We need a write barrier here to make sure that dr->head_lpos
> > is updated before we start updating other values, e.g.
> > db->id below.
>
> My RFCv2 implemented it that way. The function was called data_reserve()
> and it moved the head using cmpxchg_release(). For RFCv3 I changed to a
> full memory barrier instead because using acquire/release here is a bit
> messy. There are 2 different places where the acquire needed to be:
>
> - In _dataring_pop() a load_acquire() of head_lpos would need to be
>   _before_ loading of begin_lpos and next_lpos.
>
> - In prb_iter_next_valid_entry() a load_acquire() of head_lpos would
>   need to be at the beginning within the dataring_datablock_isvalid()
>   check (mC).
>
> If smp_mb() is too heavy to call for every printk(), then we can move to
> acquire/release. The comments of fB list exactly what is synchronized
> (and where).

smp_mb() is not a problem. printk() is a slow path.

My problem is that I want to make sure that the code works as expected.
For this, I want to understand the barriers that are used. And the
full barrier discussed in dataring_push() does not make sense to me.

Best Regards,
Petr
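
To illustrate the symmetric pairing described above, here is a minimal
userspace sketch. It is an analogy under stated assumptions, not the
printk-rb code: C11 fences stand in for smp_wmb() and smp_rmb() (a C11
release fence is somewhat stronger than smp_wmb()), and the names payload,
ready, publish() and consume() are hypothetical. The writer issues its
barrier after its own store and before the store that announces it; the
reader issues the matching barrier between its two loads.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical names for illustration; none of them come from the patch. */
static int payload;		/* plain data published by the writer */
static atomic_bool ready;	/* flag announcing that payload is valid */

/* Writer side: store the data, then the barrier, then the flag. */
static void publish(int value)
{
	payload = value;
	/* Rough analogue of smp_wmb(): the payload store is ordered
	 * before the flag store that follows. */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&ready, true, memory_order_relaxed);
}

/* Reader side: load the flag, then the barrier, then the data. */
static bool consume(int *out)
{
	if (!atomic_load_explicit(&ready, memory_order_relaxed))
		return false;
	/* Rough analogue of smp_rmb(): pairs with the writer's fence so
	 * the payload load cannot be satisfied before the flag load. */
	atomic_thread_fence(memory_order_acquire);
	*out = payload;
	return true;
}
```

If either fence is removed, the ordering guarantee is lost even though the
other side still has its barrier. That is the sense in which the barriers
need to be symmetric.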