From: "Joel Fernandes (Google)"
To: linux-kernel@vger.kernel.org
Cc: "Joel Fernandes (Google)", urezki@gmail.com, Davidlohr Bueso,
	Josh Triplett, Lai Jiangshan, Mathieu Desnoyers, "Paul E. McKenney",
	rcu@vger.kernel.org, Steven Rostedt
Subject: [PATCH v2 rcu-dev 2/3] rcu/tree: Add a shrinker to prevent OOM due to kfree_rcu() batching
Date: Mon, 16 Mar 2020 12:32:27 -0400
Message-Id: <20200316163228.62068-2-joel@joelfernandes.org>
In-Reply-To: <20200316163228.62068-1-joel@joelfernandes.org>
References: <20200316163228.62068-1-joel@joelfernandes.org>

To reduce the number of grace periods and improve kfree() performance,
we recently added batching to kfree_rcu(). This dramatically brings down
the number of grace periods and lets us use kfree_bulk() for efficient
freeing. However, it has also increased the likelihood of an OOM
condition under a heavy kfree_rcu() flood on small-memory systems.

This patch introduces a shrinker which starts grace periods right away
if the system is under memory pressure due to the existence of objects
that have still not started a grace period.
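As a rough illustration of the idea, here is a minimal userspace sketch in C. It is a model under stated assumptions, not kernel code: struct batch_model, DRAIN_DELAY_TICKS, and the model_*() functions are hypothetical names, a plain tick counter stands in for the KFREE_DRAIN_JIFFIES timer, and locking, RCU, and the actual freeing are left out. It only contrasts the normal path, where a batch waits for the timer, with the memory-pressure path, where the batch is handed off immediately.

```c
#define DRAIN_DELAY_TICKS 5	/* hypothetical stand-in for KFREE_DRAIN_JIFFIES */

/* Hypothetical model of one CPU's kfree_rcu() batching state. */
struct batch_model {
	int count;		/* objects queued, grace period not yet started */
	int ticks_left;		/* ticks until the deferred drain fires */
	int timer_armed;	/* stands in for monitor_todo */
};

/* Models kfree_call_rcu(): queue one object and arm the drain timer once. */
static void model_queue(struct batch_model *b)
{
	b->count++;
	if (!b->timer_armed) {
		b->timer_armed = 1;
		b->ticks_left = DRAIN_DELAY_TICKS;
	}
}

/* Hands the whole batch to a (pretend) grace period; returns its size. */
static int model_drain(struct batch_model *b)
{
	int handed_off = b->count;

	b->count = 0;
	b->timer_armed = 0;
	return handed_off;
}

/* Normal path: the batch is only drained once the timer expires. */
static int model_tick(struct batch_model *b)
{
	if (b->timer_armed && --b->ticks_left == 0)
		return model_drain(b);
	return 0;
}

/* Memory-pressure path: drain right away instead of waiting for the timer. */
static int model_pressure(struct batch_model *b)
{
	return b->timer_armed ? model_drain(b) : 0;
}
```

In this model, a caller that queues three objects and then signals pressure gets all three handed off at once, while without pressure the same batch would sit for DRAIN_DELAY_TICKS ticks.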

With this patch, I no longer observe an OOM on a system with 512MB RAM
and 8 CPUs, with the following rcuperf options:

rcuperf.kfree_loops=20000 rcuperf.kfree_alloc_num=8000
rcuperf.kfree_rcu_test=1 rcuperf.kfree_mult=2

Without the patch, the system easily OOMs with the above parameters.

NOTE:
1. On systems with no memory pressure, the patch has no effect, as
   intended.
2. In the future, we can use this same mechanism to prevent grace
   periods from happening even more, by relying on shrinkers carefully.

Cc: urezki@gmail.com
Signed-off-by: Joel Fernandes (Google)
---
 kernel/rcu/tree.c | 60 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 60 insertions(+)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 4a885af2ff73e..dc570dff68d7b 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2801,6 +2801,8 @@ struct kfree_rcu_cpu {
 	struct delayed_work monitor_work;
 	bool monitor_todo;
 	bool initialized;
+	// Number of objects for which GP not started
+	int count;
 };

 static DEFINE_PER_CPU(struct kfree_rcu_cpu, krc);
@@ -2914,6 +2916,8 @@ static inline bool queue_kfree_rcu_work(struct kfree_rcu_cpu *krcp)
 		krcp->head = NULL;
 	}

+	krcp->count = 0;
+
 	/*
 	 * One work is per one batch, so there are two "free channels",
 	 * "bhead_free" and "head_free" the batch can handle. It can be
@@ -3050,6 +3054,8 @@ void kfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
 		krcp->head = head;
 	}

+	krcp->count++;
+
 	// Set timer to drain after KFREE_DRAIN_JIFFIES.
 	if (rcu_scheduler_active == RCU_SCHEDULER_RUNNING &&
 	    !krcp->monitor_todo) {
@@ -3064,6 +3070,58 @@ void kfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
 }
 EXPORT_SYMBOL_GPL(kfree_call_rcu);

+static unsigned long
+kfree_rcu_shrink_count(struct shrinker *shrink, struct shrink_control *sc)
+{
+	int cpu;
+	unsigned long flags, count = 0;
+
+	/* Snapshot count of all CPUs */
+	for_each_online_cpu(cpu) {
+		struct kfree_rcu_cpu *krcp = per_cpu_ptr(&krc, cpu);
+
+		spin_lock_irqsave(&krcp->lock, flags);
+		count += krcp->count;
+		spin_unlock_irqrestore(&krcp->lock, flags);
+	}
+
+	return count;
+}
+
+static unsigned long
+kfree_rcu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
+{
+	int cpu, freed = 0;
+	unsigned long flags;
+
+	for_each_online_cpu(cpu) {
+		int count;
+		struct kfree_rcu_cpu *krcp = per_cpu_ptr(&krc, cpu);
+
+		count = krcp->count;
+		spin_lock_irqsave(&krcp->lock, flags);
+		if (krcp->monitor_todo)
+			kfree_rcu_drain_unlock(krcp, flags);
+		else
+			spin_unlock_irqrestore(&krcp->lock, flags);
+
+		sc->nr_to_scan -= count;
+		freed += count;
+
+		if (sc->nr_to_scan <= 0)
+			break;
+	}
+
+	return freed;
+}
+
+static struct shrinker kfree_rcu_shrinker = {
+	.count_objects = kfree_rcu_shrink_count,
+	.scan_objects = kfree_rcu_shrink_scan,
+	.batch = 0,
+	.seeks = DEFAULT_SEEKS,
+};
+
 void __init kfree_rcu_scheduler_running(void)
 {
 	int cpu;
@@ -3981,6 +4039,8 @@ static void __init kfree_rcu_batch_init(void)
 		INIT_DELAYED_WORK(&krcp->monitor_work, kfree_rcu_monitor);
 		krcp->initialized = true;
 	}
+	if (register_shrinker(&kfree_rcu_shrinker))
+		pr_err("Failed to register kfree_rcu() shrinker!\n");
 }

 void __init rcu_init(void)
-- 
2.25.1.481.gfbce0eb801-goog
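
The count/scan pair added above can be sketched in userspace C as follows. This is a simplified model under stated assumptions, not the kernel implementation: struct cpu_backlog, MODEL_CPUS, model_count(), and model_scan() are hypothetical names, the per-CPU spinlock is omitted, and starting a grace period is replaced by zeroing a counter. The scan budget is a signed long in this sketch so that the "<= 0" test behaves as it reads; in the kernel, nr_to_scan in struct shrink_control is an unsigned long.

```c
#define MODEL_CPUS 4	/* hypothetical CPU count for the model */

/* Hypothetical per-CPU backlog: objects whose grace period has not started. */
struct cpu_backlog {
	int count;
};

/* Models the count callback: sum the backlog over all CPUs. */
static unsigned long model_count(const struct cpu_backlog *b)
{
	unsigned long total = 0;
	int cpu;

	for (cpu = 0; cpu < MODEL_CPUS; cpu++)
		total += b[cpu].count;
	return total;
}

/*
 * Models the scan callback: drain CPUs in order until the budget is used
 * up. A whole CPU is drained at a time, so the result may exceed the budget.
 */
static unsigned long model_scan(struct cpu_backlog *b, long nr_to_scan)
{
	unsigned long freed = 0;
	int cpu;

	for (cpu = 0; cpu < MODEL_CPUS; cpu++) {
		int count = b[cpu].count;

		b[cpu].count = 0;	/* stands in for starting a grace period */
		nr_to_scan -= count;
		freed += count;
		if (nr_to_scan <= 0)
			break;
	}
	return freed;
}
```

With backlogs of 5, 3, 7, and 2 objects and a budget of 6, this model drains the first two CPUs (8 objects) and stops, leaving 9 objects for a later scan.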