From: "Joel Fernandes (Google)"
To: linux-kernel@vger.kernel.org
Cc: "Joel Fernandes (Google)", urezki@gmail.com, Davidlohr Bueso,
    Josh Triplett, Lai Jiangshan, Mathieu Desnoyers, "Paul E. McKenney",
    rcu@vger.kernel.org, Steven Rostedt
Subject: [PATCH linus/master 2/2] rcu/tree: Add a shrinker to prevent OOM due to kfree_rcu() batching
Date: Thu, 5 Mar 2020 17:13:23 -0500
Message-Id: <20200305221323.66051-2-joel@joelfernandes.org>
In-Reply-To: <20200305221323.66051-1-joel@joelfernandes.org>
References: <20200305221323.66051-1-joel@joelfernandes.org>

To reduce the number of grace periods and improve kfree() performance,
we recently started batching kfree_rcu() requests. This dramatically
brings down the number of grace periods and lets us use kfree_bulk() for
efficient freeing.

However, batching has increased the likelihood of an OOM condition under
a heavy kfree_rcu() flood on small-memory systems. This patch introduces
a shrinker that starts grace periods right away if the system is under
memory pressure due to objects whose grace period has not yet started.

With this patch, I do not observe an OOM anymore on a system with 512MB
RAM and 8 CPUs, with the following rcuperf options:

rcuperf.kfree_loops=20000 rcuperf.kfree_alloc_num=8000
rcuperf.kfree_rcu_test=1 rcuperf.kfree_mult=2

NOTE: On systems with no memory pressure, the patch has no effect, as
intended.

Cc: urezki@gmail.com
Signed-off-by: Joel Fernandes (Google)
---
 kernel/rcu/tree.c | 58 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 58 insertions(+)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index d91c9156fab2e..28ec35e15529d 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2723,6 +2723,8 @@ struct kfree_rcu_cpu {
 	struct delayed_work monitor_work;
 	bool monitor_todo;
 	bool initialized;
+	// Number of objects for which GP not started
+	int count;
 };
 
 static DEFINE_PER_CPU(struct kfree_rcu_cpu, krc);
@@ -2791,6 +2793,7 @@ static inline bool queue_kfree_rcu_work(struct kfree_rcu_cpu *krcp)
 
 	krwp->head_free = krcp->head;
 	krcp->head = NULL;
+	krcp->count = 0;
 	INIT_RCU_WORK(&krwp->rcu_work, kfree_rcu_work);
 	queue_rcu_work(system_wq, &krwp->rcu_work);
 	return true;
@@ -2864,6 +2867,7 @@ void kfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
 	head->func = func;
 	head->next = krcp->head;
 	krcp->head = head;
+	krcp->count++;
 
 	// Set timer to drain after KFREE_DRAIN_JIFFIES.
 	if (rcu_scheduler_active == RCU_SCHEDULER_RUNNING &&
@@ -2879,6 +2883,58 @@ void kfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
 }
 EXPORT_SYMBOL_GPL(kfree_call_rcu);
 
+static unsigned long
+kfree_rcu_shrink_count(struct shrinker *shrink, struct shrink_control *sc)
+{
+	int cpu;
+	unsigned long flags, count = 0;
+
+	/* Snapshot count of all CPUs */
+	for_each_online_cpu(cpu) {
+		struct kfree_rcu_cpu *krcp = per_cpu_ptr(&krc, cpu);
+
+		spin_lock_irqsave(&krcp->lock, flags);
+		count += krcp->count;
+		spin_unlock_irqrestore(&krcp->lock, flags);
+	}
+
+	return count;
+}
+
+static unsigned long
+kfree_rcu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
+{
+	int cpu, freed = 0;
+	unsigned long flags;
+
+	for_each_online_cpu(cpu) {
+		int count;
+		struct kfree_rcu_cpu *krcp = per_cpu_ptr(&krc, cpu);
+
+		count = krcp->count;
+		spin_lock_irqsave(&krcp->lock, flags);
+		if (krcp->monitor_todo)
+			kfree_rcu_drain_unlock(krcp, flags);
+		else
+			spin_unlock_irqrestore(&krcp->lock, flags);
+
+		sc->nr_to_scan -= count;
+		freed += count;
+
+		if (sc->nr_to_scan <= 0)
+			break;
+	}
+
+	return freed;
+}
+
+static struct shrinker kfree_rcu_shrinker = {
+	.count_objects = kfree_rcu_shrink_count,
+	.scan_objects = kfree_rcu_shrink_scan,
+	.batch = 0,
+	.seeks = DEFAULT_SEEKS,
+};
+
 void __init kfree_rcu_scheduler_running(void)
 {
 	int cpu;
@@ -3774,6 +3830,8 @@ static void __init kfree_rcu_batch_init(void)
 		INIT_DELAYED_WORK(&krcp->monitor_work, kfree_rcu_monitor);
 		krcp->initialized = true;
 	}
+	if (register_shrinker(&kfree_rcu_shrinker))
+		pr_err("Failed to register kfree_rcu() shrinker!\n");
 }
 
 void __init rcu_init(void)
-- 
2.25.0.265.gbab2e86ba0-goog
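
For readers who want to experiment with the count/scan accounting outside
the kernel, here is a rough userspace model of the two shrinker callbacks.
It is only a sketch under assumptions, not kernel code. Every model_* name
and the fixed NR_CPUS value are hypothetical. Locking, the monitor_todo
check, and the actual grace-period machinery are left out, and a drain is
modeled simply as resetting the per-CPU counter. Because nr_to_scan is
unsigned, the model clamps the remaining budget at zero instead of
subtracting past it.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 4 /* hypothetical CPU count for the model */

/* Stand-in for struct kfree_rcu_cpu: only the new counter is modeled. */
struct model_krc {
	unsigned long count; /* objects queued, grace period not yet started */
};

/* Stand-in for struct shrink_control; nr_to_scan is unsigned here too. */
struct model_shrink_control {
	unsigned long nr_to_scan;
};

static struct model_krc model_krcs[NR_CPUS];

/* Models kfree_call_rcu(): one more object waits for a grace period. */
static void model_queue(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	model_krcs[cpu].count++;
}

/* Models ->count_objects: sum the per-CPU counters. */
static unsigned long model_shrink_count(void)
{
	unsigned long total = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		total += model_krcs[cpu].count;

	return total;
}

/* Models ->scan_objects: drain whole CPUs until the scan budget runs out. */
static unsigned long model_shrink_scan(struct model_shrink_control *sc)
{
	unsigned long freed = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		unsigned long count = model_krcs[cpu].count;

		/* Models handing this CPU's list over to a grace period. */
		model_krcs[cpu].count = 0;
		freed += count;

		/* Clamp at zero rather than letting the unsigned budget wrap. */
		sc->nr_to_scan = count >= sc->nr_to_scan ?
				 0 : sc->nr_to_scan - count;
		if (sc->nr_to_scan == 0)
			break;
	}

	return freed;
}
```

Since a CPU's list is drained as a whole, the model can report more
objects than the budget asked for. With 5, 3 and 1 objects queued on
CPUs 0, 1 and 2 and a budget of 6, the scan drains CPUs 0 and 1, returns
8, and leaves one object counted.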