Date: Mon, 11 May 2020 20:59:39 -0000
From: "tip-bot2 for Joel Fernandes (Google)"
Reply-to: linux-kernel@vger.kernel.org
To: linux-tip-commits@vger.kernel.org
Subject: [tip: core/rcu] rcu/tree: Add a shrinker to prevent OOM due to kfree_rcu() batching
Cc: urezki@gmail.com, "Joel Fernandes (Google)", "Paul E. McKenney", x86, LKML
Message-ID: <158923077980.390.281247872169365012.tip-bot2@tip-bot2>

The following commit has been merged into the core/rcu branch of tip:

Commit-ID:     9154244c1ab6c9db4f1f25ac8f73bd46dba64287
Gitweb:        https://git.kernel.org/tip/9154244c1ab6c9db4f1f25ac8f73bd46dba64287
Author:        Joel Fernandes (Google)
AuthorDate:    Mon, 16 Mar 2020 12:32:27 -04:00
Committer:     Paul E. McKenney
CommitterDate: Mon, 27 Apr 2020 11:02:50 -07:00

rcu/tree: Add a shrinker to prevent OOM due to kfree_rcu() batching

To reduce grace periods and improve kfree() performance, we have done
batching recently dramatically bringing down the number of grace periods
while giving us the ability to use kfree_bulk() for efficient kfree'ing.

However, this has increased the likelihood of OOM condition under heavy
kfree_rcu() flood on small memory systems. This patch introduces a
shrinker which starts grace periods right away if the system is under
memory pressure due to existence of objects that have still not started
a grace period.

With this patch, I do not observe an OOM anymore on a system with 512MB
RAM and 8 CPUs, with the following rcuperf options:

rcuperf.kfree_loops=20000 rcuperf.kfree_alloc_num=8000
rcuperf.kfree_rcu_test=1 rcuperf.kfree_mult=2

Otherwise it easily OOMs with the above parameters.

NOTE:
1. On systems with no memory pressure, the patch has no effect as intended.
2. In the future, we can use this same mechanism to prevent grace periods
   from happening even more, by relying on shrinkers carefully.

Cc: urezki@gmail.com
Signed-off-by: Joel Fernandes (Google)
Signed-off-by: Paul E. McKenney
---
 kernel/rcu/tree.c | 60 ++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 60 insertions(+)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 156ac8d..e299cd0 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2824,6 +2824,8 @@ struct kfree_rcu_cpu {
 	struct delayed_work monitor_work;
 	bool monitor_todo;
 	bool initialized;
+	// Number of objects for which GP not started
+	int count;
 };
 
 static DEFINE_PER_CPU(struct kfree_rcu_cpu, krc);
@@ -2937,6 +2939,8 @@ static inline bool queue_kfree_rcu_work(struct kfree_rcu_cpu *krcp)
 				krcp->head = NULL;
 			}
 
+			krcp->count = 0;
+
 			/*
 			 * One work is per one batch, so there are two "free channels",
 			 * "bhead_free" and "head_free" the batch can handle. It can be
@@ -3073,6 +3077,8 @@ void kfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
 		krcp->head = head;
 	}
 
+	krcp->count++;
+
 	// Set timer to drain after KFREE_DRAIN_JIFFIES.
 	if (rcu_scheduler_active == RCU_SCHEDULER_RUNNING &&
 	    !krcp->monitor_todo) {
@@ -3087,6 +3093,58 @@ unlock_return:
 }
 EXPORT_SYMBOL_GPL(kfree_call_rcu);
 
+static unsigned long
+kfree_rcu_shrink_count(struct shrinker *shrink, struct shrink_control *sc)
+{
+	int cpu;
+	unsigned long flags, count = 0;
+
+	/* Snapshot count of all CPUs */
+	for_each_online_cpu(cpu) {
+		struct kfree_rcu_cpu *krcp = per_cpu_ptr(&krc, cpu);
+
+		spin_lock_irqsave(&krcp->lock, flags);
+		count += krcp->count;
+		spin_unlock_irqrestore(&krcp->lock, flags);
+	}
+
+	return count;
+}
+
+static unsigned long
+kfree_rcu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
+{
+	int cpu, freed = 0;
+	unsigned long flags;
+
+	for_each_online_cpu(cpu) {
+		int count;
+		struct kfree_rcu_cpu *krcp = per_cpu_ptr(&krc, cpu);
+
+		count = krcp->count;
+		spin_lock_irqsave(&krcp->lock, flags);
+		if (krcp->monitor_todo)
+			kfree_rcu_drain_unlock(krcp, flags);
+		else
+			spin_unlock_irqrestore(&krcp->lock, flags);
+
+		sc->nr_to_scan -= count;
+		freed += count;
+
+		if (sc->nr_to_scan <= 0)
+			break;
+	}
+
+	return freed;
+}
+
+static struct shrinker kfree_rcu_shrinker = {
+	.count_objects = kfree_rcu_shrink_count,
+	.scan_objects = kfree_rcu_shrink_scan,
+	.batch = 0,
+	.seeks = DEFAULT_SEEKS,
+};
+
 void __init kfree_rcu_scheduler_running(void)
 {
 	int cpu;
@@ -4007,6 +4065,8 @@ static void __init kfree_rcu_batch_init(void)
 		INIT_DELAYED_WORK(&krcp->monitor_work, kfree_rcu_monitor);
 		krcp->initialized = true;
 	}
+	if (register_shrinker(&kfree_rcu_shrinker))
+		pr_err("Failed to register kfree_rcu() shrinker!\n");
 }
 
 void __init rcu_init(void)
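
For readers who want to trace the accounting outside the kernel, the count/scan pair above can be approximated by a small userspace model. This is only a sketch under stated assumptions, not kernel code: every `model_*` name and `MODEL_NR_CPUS` is hypothetical, locking is omitted, the drain is assumed to always succeed (standing in for kfree_rcu_drain_unlock() resetting krcp->count), and nr_to_scan is a signed long here so that the `<= 0` test behaves as written.

```c
#include <assert.h>

/* Hypothetical CPU count for the model; the patch walks online CPUs. */
#define MODEL_NR_CPUS 4

/* Simplified stand-in for struct kfree_rcu_cpu (assumption: no lock). */
struct model_krc {
	int count;		/* objects queued, grace period not started */
	int monitor_todo;	/* nonzero while a drain is still pending */
};

static struct model_krc model_krc[MODEL_NR_CPUS];

/* Mirrors the krcp->count++ accounting done in kfree_call_rcu(). */
static void model_queue(int cpu, int n)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	model_krc[cpu].count += n;
	model_krc[cpu].monitor_todo = 1;
}

/* Mirrors kfree_rcu_shrink_count(): sum the per-CPU counters. */
static unsigned long model_shrink_count(void)
{
	unsigned long total = 0;
	int cpu;

	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		total += model_krc[cpu].count;

	return total;
}

/*
 * Mirrors kfree_rcu_shrink_scan(): drain CPU by CPU and stop once the
 * requested budget is used up. Whole per-CPU batches are drained, so the
 * return value may exceed nr_to_scan.
 */
static unsigned long model_shrink_scan(long nr_to_scan)
{
	unsigned long freed = 0;
	int cpu;

	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		int count = model_krc[cpu].count;

		if (model_krc[cpu].monitor_todo) {
			/* Assumed-successful drain: batch handed to a GP. */
			model_krc[cpu].count = 0;
			model_krc[cpu].monitor_todo = 0;
		}

		nr_to_scan -= count;
		freed += count;

		if (nr_to_scan <= 0)
			break;
	}

	return freed;
}
```

Queuing 5 objects on CPU 0 and 3 on CPU 1 and then calling model_shrink_scan(4) drains only CPU 0 in this model, because the budget is exhausted after the first batch; a later scan with a larger budget picks up CPU 1.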