From: Uladzislau Rezki
Date: Wed, 1 Nov 2023 11:33:35 +0100
To: Boqun Feng
Cc: Uladzislau Rezki, "Paul E. McKenney", RCU, Neeraj upadhyay, Hillf Danton, Joel Fernandes, LKML, Oleksiy Avramchenko, Frederic Weisbecker
Subject: Re: [PATCH 1/3] rcu: Reduce synchronize_rcu() waiting time
References: <20231025140915.590390-1-urezki@gmail.com> <20231025140915.590390-2-urezki@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sun, Oct 29, 2023 at 11:21:11AM -0700, Boqun Feng wrote:
> On Thu, Oct 26, 2023 at 03:09:02PM +0200, Uladzislau Rezki wrote:
> [...]
> > > Late to the party, but I kinda wonder whether we can resolve it by:
> > >
> > > 1) either introduce a separate seglist that only contains callbacks
> > > queued by call_rcu_hurry(), and whenever after an GP and callbacks are
> > > ready, call_rcu_hurry() callbacks will be called first.
> > >
> > > 2) or make call_rcu_hurry() callbacks always inserted at the head of the
> > > NEXT list instead of the tail, e.g. (untested code):
> > >
> > > diff --git a/kernel/rcu/rcu_segcblist.c b/kernel/rcu/rcu_segcblist.c
> > > index f71fac422c8f..89a875f8ecc7 100644
> > > --- a/kernel/rcu/rcu_segcblist.c
> > > +++ b/kernel/rcu/rcu_segcblist.c
> > > @@ -338,13 +338,21 @@ bool rcu_segcblist_nextgp(struct rcu_segcblist *rsclp, unsigned long *lp)
> > >   * absolutely not OK for it to ever miss posting a callback.
> > >   */
> > >  void rcu_segcblist_enqueue(struct rcu_segcblist *rsclp,
> > > -			   struct rcu_head *rhp)
> > > +			   struct rcu_head *rhp,
> > > +			   bool is_lazy)
> > >  {
> > >  	rcu_segcblist_inc_len(rsclp);
> > >  	rcu_segcblist_inc_seglen(rsclp, RCU_NEXT_TAIL);
> > > -	rhp->next = NULL;
> > > -	WRITE_ONCE(*rsclp->tails[RCU_NEXT_TAIL], rhp);
> > > -	WRITE_ONCE(rsclp->tails[RCU_NEXT_TAIL], &rhp->next);
> > > +	/* If hurry and the list is not empty, put it in the front */
> > > +	if (!is_lazy && rcu_segcblist_get_seglen(rsclp, RCU_NEXT_TAIL) > 1) {
> > > +		// hurry callback, queued at front
> > > +		rhp->next = READ_ONCE(*rsclp->tails[RCU_NEXT_READY_TAIL]);
> > > +		WRITE_ONCE(*rsclp->tails[RCU_NEXT_READY_TAIL], rhp);
> > > +	} else {
> > > +		rhp->next = NULL;
> > > +		WRITE_ONCE(*rsclp->tails[RCU_NEXT_TAIL], rhp);
> > > +		WRITE_ONCE(rsclp->tails[RCU_NEXT_TAIL], &rhp->next);
> > > +	}
> > >  }
> > >
> > >  /*
> > > diff --git a/kernel/rcu/rcu_segcblist.h b/kernel/rcu/rcu_segcblist.h
> > > index 4fe877f5f654..459475bb8df9 100644
> > > --- a/kernel/rcu/rcu_segcblist.h
> > > +++ b/kernel/rcu/rcu_segcblist.h
> > > @@ -136,7 +136,8 @@ struct rcu_head *rcu_segcblist_first_cb(struct rcu_segcblist *rsclp);
> > >  struct rcu_head *rcu_segcblist_first_pend_cb(struct rcu_segcblist *rsclp);
> > >  bool rcu_segcblist_nextgp(struct rcu_segcblist *rsclp, unsigned long *lp);
> > >  void rcu_segcblist_enqueue(struct rcu_segcblist *rsclp,
> > > -			   struct rcu_head *rhp);
> > > +			   struct rcu_head *rhp,
> > > +			   bool is_lazy);
> > >  bool rcu_segcblist_entrain(struct rcu_segcblist *rsclp,
> > >  			   struct rcu_head *rhp);
> > >  void rcu_segcblist_extract_done_cbs(struct rcu_segcblist *rsclp,
> > > diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
> > > index 20d7a238d675..53adf5ab9c9f 100644
> > > --- a/kernel/rcu/srcutree.c
> > > +++ b/kernel/rcu/srcutree.c
> > > @@ -1241,7 +1241,7 @@ static unsigned long srcu_gp_start_if_needed(struct srcu_struct *ssp,
> > >  	sdp = raw_cpu_ptr(ssp->sda);
> > >  	spin_lock_irqsave_sdp_contention(sdp, &flags);
> > >  	if (rhp)
> > > -		rcu_segcblist_enqueue(&sdp->srcu_cblist, rhp);
> > > +		rcu_segcblist_enqueue(&sdp->srcu_cblist, rhp, true);
> > >  	rcu_segcblist_advance(&sdp->srcu_cblist,
> > >  			      rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq));
> > >  	s = rcu_seq_snap(&ssp->srcu_sup->srcu_gp_seq);
> > > diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
> > > index 8d65f7d576a3..7dec7c68f88f 100644
> > > --- a/kernel/rcu/tasks.h
> > > +++ b/kernel/rcu/tasks.h
> > > @@ -362,7 +362,7 @@ static void call_rcu_tasks_generic(struct rcu_head *rhp, rcu_callback_t func,
> > >  	}
> > >  	if (needwake)
> > >  		rtpcp->urgent_gp = 3;
> > > -	rcu_segcblist_enqueue(&rtpcp->cblist, rhp);
> > > +	rcu_segcblist_enqueue(&rtpcp->cblist, rhp, true);
> > >  	raw_spin_unlock_irqrestore_rcu_node(rtpcp, flags);
> > >  	if (unlikely(needadjust)) {
> > >  		raw_spin_lock_irqsave(&rtp->cbs_gbl_lock, flags);
> > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > > index cb1caefa8bd0..e05cbff40dc7 100644
> > > --- a/kernel/rcu/tree.c
> > > +++ b/kernel/rcu/tree.c
> > > @@ -2670,7 +2670,7 @@ __call_rcu_common(struct rcu_head *head, rcu_callback_t func, bool lazy_in)
> > >  	if (rcu_nocb_try_bypass(rdp, head, &was_alldone, flags, lazy))
> > >  		return; // Enqueued onto ->nocb_bypass, so just leave.
> > >  	// If no-CBs CPU gets here, rcu_nocb_try_bypass() acquired ->nocb_lock.
> > > -	rcu_segcblist_enqueue(&rdp->cblist, head);
> > > +	rcu_segcblist_enqueue(&rdp->cblist, head, lazy_in);
> > >  	if (__is_kvfree_rcu_offset((unsigned long)func))
> > >  		trace_rcu_kvfree_callback(rcu_state.name, head,
> > >  					 (unsigned long)func,
> >
> > > Surprisingly, this survives from a whole rcutorture run ;-)
>
> > > Sure, there may be some corner cases I'm missing, but I think overall
> > > this is better than (sorta) duplicating the logic of seglist (the llist
> > > in sr_normal_state) or the logic of wake_rcu_gp()
> > > (synchronize_rcu_normal).
> > >
> > > Anyway, these are just if-you-have-time-to-try options ;-)
> > >
> > Hm.. You still mix callbacks and there is a dependency in order
> > of execution. The callback process time also might be varied from
> > one callback to another.
> >
> > If you have many *_hurry() calls we end in the same situation. Apart
>
> I plan to resolve that by only puting a call_rcu_hurry(wakeme_after_gp)
> in the front of the list.
>
> > of that we also have !CONFIG_RCU_NOCB_CPU path that is also covered
> > by the patch that is in question.
>
> I don't see why the above approach doesn't work for
> !CONFIG_RCU_NOCB_CPU, but I maybe miss something here.
>
Basically it does not work, because you do not fix the mixing "issue".
I have been working on it and we agreed to separate it, because it just
makes sense. The reason and the problem I see are described in the
commit message of v2.

>
> Do you have a benchmark I can try out to see if my diff can achieve the
> similar result? Thanks!
>
There is no good benchmark, but you can write one for sure. I tested
three scenarios:

- Running a camera app on our Android devices and measuring app launch
  time in milliseconds;
- Doing synchronize_rcu() and kfree(ptr) simultaneously by 10K/etc workers.
  It is an important test case because we have a fallback to this scenario
  in our kvfree_rcu_mightsleep() API.
- Looking at the time delta of loading 100 kernel modules.

Those were my main test cases.

--
Uladzislau Rezki