Subject: Re: [tip: core/rcu] rcu/tree: Add a shrinker to prevent OOM due to kfree_rcu() batching
CC: "Joel Fernandes (Google)", "Paul E. 
McKenney", x86
References: <158923077980.390.281247872169365012.tip-bot2@tip-bot2>
From: peter enderborg
Message-ID: <49168aa9-4f3a-e602-edd4-98e8b0138b0b@sony.com>
Date: Wed, 3 Jun 2020 17:51:08 +0200
In-Reply-To: <158923077980.390.281247872169365012.tip-bot2@tip-bot2>
X-Mailing-List: linux-kernel@vger.kernel.org

On 5/11/20 10:59 PM, tip-bot2 for Joel Fernandes (Google) wrote:
> The following commit has been merged into the core/rcu branch of tip:
>
> Commit-ID:     9154244c1ab6c9db4f1f25ac8f73bd46dba64287
> Gitweb:        https://git.kernel.org/tip/9154244c1ab6c9db4f1f25ac8f73bd46dba64287
> Author:        Joel Fernandes (Google)
> AuthorDate:    Mon, 16 Mar 2020 12:32:27 -04:00
> Committer:     Paul E. McKenney
> CommitterDate: Mon, 27 Apr 2020 11:02:50 -07:00
>
> rcu/tree: Add a shrinker to prevent OOM due to kfree_rcu() batching
>
> To reduce grace periods and improve kfree() performance, we have done
> batching recently dramatically bringing down the number of grace periods
> while giving us the ability to use kfree_bulk() for efficient kfree'ing.
>
> However, this has increased the likelihood of OOM condition under heavy
> kfree_rcu() flood on small memory systems. This patch introduces a
> shrinker which starts grace periods right away if the system is under
> memory pressure due to existence of objects that have still not started
> a grace period.
>
> With this patch, I do not observe an OOM anymore on a system with 512MB
> RAM and 8 CPUs, with the following rcuperf options:
>
> rcuperf.kfree_loops=20000 rcuperf.kfree_alloc_num=8000
> rcuperf.kfree_rcu_test=1 rcuperf.kfree_mult=2
>
> Otherwise it easily OOMs with the above parameters.
>
> NOTE:
> 1. On systems with no memory pressure, the patch has no effect as intended.
> 2. In the future, we can use this same mechanism to prevent grace periods
>    from happening even more, by relying on shrinkers carefully.
>
> Cc: urezki@gmail.com
> Signed-off-by: Joel Fernandes (Google)
> Signed-off-by: Paul E. McKenney
> ---
>  kernel/rcu/tree.c | 60 ++++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 60 insertions(+)
>
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 156ac8d..e299cd0 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -2824,6 +2824,8 @@ struct kfree_rcu_cpu {
>  	struct delayed_work monitor_work;
>  	bool monitor_todo;
>  	bool initialized;
> +	// Number of objects for which GP not started
> +	int count;

Wouldn't an atomic counter be better here, to avoid the IRQ handling in
shrink_count?

>  };
>
>  static DEFINE_PER_CPU(struct kfree_rcu_cpu, krc);
> @@ -2937,6 +2939,8 @@ static inline bool queue_kfree_rcu_work(struct kfree_rcu_cpu *krcp)
>  		krcp->head = NULL;
>  	}
>
> +	krcp->count = 0;
> +
>  	/*
>  	 * One work is per one batch, so there are two "free channels",
>  	 * "bhead_free" and "head_free" the batch can handle. It can be
> @@ -3073,6 +3077,8 @@ void kfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
>  		krcp->head = head;
>  	}
>
> +	krcp->count++;
> +
>  	// Set timer to drain after KFREE_DRAIN_JIFFIES.
>  	if (rcu_scheduler_active == RCU_SCHEDULER_RUNNING &&
>  	    !krcp->monitor_todo) {
> @@ -3087,6 +3093,58 @@ unlock_return:
>  }
>  EXPORT_SYMBOL_GPL(kfree_call_rcu);
>
> +static unsigned long
> +kfree_rcu_shrink_count(struct shrinker *shrink, struct shrink_control *sc)
> +{
> +	int cpu;
> +	unsigned long flags, count = 0;
> +
> +	/* Snapshot count of all CPUs */
> +	for_each_online_cpu(cpu) {
> +		struct kfree_rcu_cpu *krcp = per_cpu_ptr(&krc, cpu);
> +
> +		spin_lock_irqsave(&krcp->lock, flags);
> +		count += krcp->count;
> +		spin_unlock_irqrestore(&krcp->lock, flags);
> +	}
> +
> +	return count;
> +}
> +
> +static unsigned long
> +kfree_rcu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
> +{
> +	int cpu, freed = 0;
> +	unsigned long flags;
> +
> +	for_each_online_cpu(cpu) {
> +		int count;
> +		struct kfree_rcu_cpu *krcp = per_cpu_ptr(&krc, cpu);
> +
> +		count = krcp->count;

This read should be done with the lock held.

> +		spin_lock_irqsave(&krcp->lock, flags);
> +		if (krcp->monitor_todo)
> +			kfree_rcu_drain_unlock(krcp, flags);
> +		else
> +			spin_unlock_irqrestore(&krcp->lock, flags);
> +
> +		sc->nr_to_scan -= count;
> +		freed += count;
> +
> +		if (sc->nr_to_scan <= 0)
> +			break;
> +	}
> +
> +	return freed;
> +}
> +
> +static struct shrinker kfree_rcu_shrinker = {
> +	.count_objects = kfree_rcu_shrink_count,
> +	.scan_objects = kfree_rcu_shrink_scan,
> +	.batch = 0,
> +	.seeks = DEFAULT_SEEKS,
> +};
> +
>  void __init kfree_rcu_scheduler_running(void)
>  {
>  	int cpu;
> @@ -4007,6 +4065,8 @@ static void __init kfree_rcu_batch_init(void)
>  		INIT_DELAYED_WORK(&krcp->monitor_work, kfree_rcu_monitor);
>  		krcp->initialized = true;
>  	}
> +	if (register_shrinker(&kfree_rcu_shrinker))
> +		pr_err("Failed to register kfree_rcu() shrinker!\n");
>  }
>
>  void __init rcu_init(void)
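
To make the two comments above concrete, here is a rough user-space
sketch of the idea. It is not a patch against kernel/rcu/tree.c. Every
name in it (krc_sketch, krc_queue, krc_shrink_count, krc_shrink_scan,
the fixed NR_CPUS) is made up for illustration, C11 atomics stand in for
the kernel's atomic types, and a small atomic_flag spinlock stands in
for krcp->lock. It only shows the shape: the count path reads the
counters without taking any lock, and the scan path samples the counter
while the lock is held.

```c
/* User-space analogy only; all names here are hypothetical, not kernel API. */
#include <assert.h>
#include <stdatomic.h>

#define NR_CPUS 4			/* assumed fixed CPU count for the sketch */

struct krc_sketch {
	atomic_flag lock;		/* stands in for krcp->lock */
	atomic_long count;		/* objects for which GP not started */
};

static struct krc_sketch krc[NR_CPUS];

static void krc_lock(struct krc_sketch *k)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&k->lock, memory_order_acquire))
		;
}

static void krc_unlock(struct krc_sketch *k)
{
	atomic_flag_clear_explicit(&k->lock, memory_order_release);
}

static void krc_init(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		atomic_flag_clear(&krc[cpu].lock);
		atomic_init(&krc[cpu].count, 0);
	}
}

/* Enqueue path: still under the lock, like kfree_call_rcu() in the patch. */
static void krc_queue(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	krc_lock(&krc[cpu]);
	atomic_fetch_add(&krc[cpu].count, 1);
	krc_unlock(&krc[cpu]);
}

/* Count path: plain atomic loads, no lock and no IRQ save/restore. */
static long krc_shrink_count(void)
{
	long total = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		total += atomic_load(&krc[cpu].count);
	return total;
}

/* Scan path: the counter is read (and reset) while the lock is held. */
static long krc_shrink_scan(long nr_to_scan)
{
	long freed = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		krc_lock(&krc[cpu]);
		long count = atomic_exchange(&krc[cpu].count, 0);
		krc_unlock(&krc[cpu]);

		nr_to_scan -= count;
		freed += count;
		if (nr_to_scan <= 0)
			break;
	}
	return freed;
}
```

In this sketch the reset to zero is folded into the scan path, whereas
the quoted patch resets the count in queue_kfree_rcu_work(). That is a
simplification for the example, not a proposal to change where the
reset lives.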