Date: Tue, 20 Aug 2019 20:27:05 -0400
From: Joel Fernandes
To: "Paul E. McKenney"
Cc: linux-kernel@vger.kernel.org, byungchul.park@lge.com, Davidlohr Bueso,
    Josh Triplett, kernel-team@android.com, kernel-team@lge.com,
    Lai Jiangshan, Mathieu Desnoyers, max.byungchul.park@gmail.com,
    Rao Shoaib, rcu@vger.kernel.org, Steven Rostedt
Subject: Re: [PATCH v4 2/2] rcuperf: Add kfree_rcu() performance Tests
Message-ID: <20190821002705.GA212946@google.com>
References: <20190814160411.58591-1-joel@joelfernandes.org>
    <20190814160411.58591-2-joel@joelfernandes.org>
    <20190814225850.GZ28441@linux.ibm.com>
    <20190819193327.GF117548@google.com>
    <20190819222330.GH28441@linux.ibm.com>
    <20190819235123.GA185164@google.com>
    <20190820025056.GL28441@linux.ibm.com>
In-Reply-To: <20190820025056.GL28441@linux.ibm.com>
User-Agent: Mutt/1.10.1 (2018-07-13)

On Mon, Aug 19, 2019 at 07:50:56PM -0700, Paul E. McKenney wrote:
> > > > > > +	do {
> > > > > > +		for (i = 0; i < kfree_alloc_num; i++) {
> > > > > > +			alloc_ptrs[i] = kmalloc(sizeof(struct kfree_obj), GFP_KERNEL);
> > > > > > +			if (!alloc_ptrs[i])
> > > > > > +				return -ENOMEM;
> > > > > > +		}
> > > > > > +
> > > > > > +		for (i = 0; i < kfree_alloc_num; i++) {
> > > > > > +			if (!kfree_no_batch) {
> > > > > > +				kfree_rcu(alloc_ptrs[i], rh);
> > > > > > +			} else {
> > > > > > +				rcu_callback_t cb;
> > > > > > +
> > > > > > +				cb = (rcu_callback_t)(unsigned long)offsetof(struct kfree_obj, rh);
> > > > > > +				kfree_call_rcu_nobatch(&(alloc_ptrs[i]->rh), cb);
> > > > > > +			}
> > > > > > +		}
> > > > >
> > > > > The point of allocating a large batch and then kfree_rcu()ing them in a
> > > > > loop is to defeat the per-CPU pool optimization? Either way, a comment
> > > > > would be very good!
> > > >
> > > > It was a reasoning like this, added it as a comment:
> > > >
> > > > 	/* While measuring kfree_rcu() time, we also end up measuring kmalloc()
> > > > 	 * time. So the strategy here is to do a few (kfree_alloc_num) number
> > > > 	 * of kmalloc() and kfree_rcu() every loop so that the current loop's
> > > > 	 * deferred kfree()ing overlaps with the next loop's kmalloc().
> > > > 	 */
> > >
> > > The thought being that the CPU will be executing the two loops
> > > concurrently? Up to a point, agreed, but how much of an effect is
> > > that, really?
> >
> > Yes, it may not matter much. It was just a small thought when I added the
> > loop; I had to start somewhere, so I did it this way.
> >
> > > Or is the idea to time the kfree_rcu() loop separately? (I don't see
> > > any such separate timing, though.)
> >
> > The kmalloc() times are included within the kfree loop. The timing of
> > kfree_rcu() is not separate in my patch.
>
> You lost me on this one. What happens when you just interleave the
> kmalloc() and kfree_rcu(), without looping, compared to the looping
> above? Does this get more expensive? Cheaper? More vulnerable to OOM?
> Something else?

You mean pairing a single kmalloc() with a single kfree_rcu() and doing this
several times? The results are very similar to doing kfree_alloc_num
kmalloc()s, then doing kfree_alloc_num kfree_rcu()s, and repeating the whole
thing kfree_loops times (as done by the rcuperf patch we are reviewing).
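To make "fully interleaved" concrete, what I measured is roughly the sketch
below. This is only an illustration, not the exact code I ran: it borrows
struct kfree_obj, its rh field, and the kfree_alloc_num/kfree_loops module
parameters from the patch, and the function name kfree_interleaved_test() is
made up.

static int kfree_interleaved_test(void)
{
	struct kfree_obj *obj;
	long loop, i;

	for (loop = 0; loop < kfree_loops; loop++) {
		for (i = 0; i < kfree_alloc_num; i++) {
			obj = kmalloc(sizeof(*obj), GFP_KERNEL);
			if (!obj)
				return -ENOMEM;

			/* Queue for deferred kfree() right after allocating. */
			kfree_rcu(obj, rh);
		}
	}

	return 0;
}

For the "No Batching" runs below, think of the kfree_rcu() line above as
replaced by the kfree_call_rcu_nobatch() call from the patch, selected by
kfree_no_batch as before.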
Following are some numbers. One difference is that the no-batching case does
seem to complete even faster when we fully interleave kmalloc() with
kfree_rcu(), while the batching case in the same scenario completes in the
same time as the "not fully interleaved" scenario did. However, the
grace-period reduction improvements and the chances of OOM'ing are pretty
much the same in either case.

Fully interleaved: a single kmalloc() followed by a kfree_rcu(), done
kfree_alloc_num * kfree_loops times.
=======================
(1) Batching
rcuperf.kfree_loops=20000 rcuperf.kfree_alloc_num=8000
rcuperf.kfree_no_batch=0 rcuperf.kfree_rcu_test=1

root@(none):/# free -m
              total        used        free      shared  buff/cache   available
Mem:            977         261         675           0          39         674

[   15.635620] Total time taken by all kfree'ers: 14255673998 ns, loops: 20000, batches: 1596

(2) No Batching
rcuperf.kfree_loops=20000 rcuperf.kfree_alloc_num=8000
rcuperf.kfree_no_batch=1 rcuperf.kfree_rcu_test=1

root@(none):/# free -m
              total        used        free      shared  buff/cache   available
Mem:            977          67         870           0          39         869
Swap:             0           0           0

[   12.365872] Total time taken by all kfree'ers: 10902137101 ns, loops: 20000, batches: 6893


Not fully interleaved: do kfree_alloc_num kmalloc()s, then do kfree_alloc_num
kfree_rcu()s, and repeat this kfree_loops times.
=======================
(1) Batching
rcuperf.kfree_loops=20000 rcuperf.kfree_alloc_num=8000
rcuperf.kfree_no_batch=0 rcuperf.kfree_rcu_test=1

root@(none):/# free -m
              total        used        free      shared  buff/cache   available
Mem:            977         251         686           0          39         684
Swap:             0           0           0

[   15.574402] Total time taken by all kfree'ers: 14185970787 ns, loops: 20000, batches: 1548

(2) No Batching
rcuperf.kfree_loops=20000 rcuperf.kfree_alloc_num=8000
rcuperf.kfree_no_batch=1 rcuperf.kfree_rcu_test=1

root@(none):/# free -m
              total        used        free      shared  buff/cache   available
Mem:            977          82         855           0          39         853
Swap:             0           0           0

[   13.724554] Total time taken by all kfree'ers: 12246217291 ns, loops: 20000, batches: 7262

thanks,

 - Joel
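P.S. For anyone wondering what the kfree_no_batch=1 ("No Batching") runs
above are actually queueing: the offsetof() value that the test passes as the
"callback" relies on the classic kfree_rcu() offset encoding. Roughly (an
illustration only, not the kernel's exact reclaim code; reclaim_sketch() is a
made-up name):

/*
 * A callback "pointer" smaller than 4096 is treated as the offset of the
 * rcu_head within the enclosing object, so RCU can kfree() the object
 * directly instead of invoking a function through the pointer.
 */
static void reclaim_sketch(struct rcu_head *head)
{
	unsigned long offset = (unsigned long)head->func;

	if (offset < 4096)
		kfree((void *)head - offset);	/* back up to the object's start */
	else
		head->func(head);		/* a real callback function */
}

That is why the non-batched path can hand kfree_call_rcu_nobatch() an
offsetof(struct kfree_obj, rh) cast to rcu_callback_t rather than a real
function.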