Received: by 2002:a25:ca44:0:0:0:0:0 with SMTP id a65csp15647ybg; Mon, 27 Jul 2020 14:11:08 -0700 (PDT) X-Google-Smtp-Source: ABdhPJwhhR1SLo6B8A9Aw19hmcA/UMzykWiN/eWiVUNOVjyXDIfblj890gNR6kCnjXJHFU0aXLQS X-Received: by 2002:a05:6402:7cf:: with SMTP id u15mr14873496edy.360.1595884267892; Mon, 27 Jul 2020 14:11:07 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1595884267; cv=none; d=google.com; s=arc-20160816; b=subujUJ73qJwLReZBtg4kBzo25F5oZGOz4uG6S1X62Int7vkavVlgQJQGpG3aV+US0 faRr2/TA8meUeGrAHZ6vp8JbzJNOAHP9KZjasof1P+w1+zihy0vLvo54MsMnaueB+N5g OSvAgsPdafT5AiIbkGIiTjNhhM1IF2o00kXFovkh893shIPsk2xYdtRhZtsRd4qnkaPg KbcEsVgKZwvZqyL2BDCYTF7xs53nvoRBqrpalWZU/5Je8gSVLzx07BP0zTRU2x8tzFgj fPAJG2Bluw/SjLolYLaC0BP4w42b9DRVKCEq9lP2j2Ko4v3UQBuouxycfrecrA/47wU2 IzPA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:mime-version :message-id:date:subject:cc:to:from:dkim-signature; bh=gCi4uQupGH9wlbYDIIaXDuhHkIJkqnW5jRCmhH4yzys=; b=IT1XyksECH8Xn9JJWe/7WCilB8AaChzo2MYVkmya2SI1nVrdY3G7ChwmgiNi4DdJul MDdtro+m4WOqkN7x14CqyhWT/3S+gv6gAE9DQx1UhZCRvR+CrgY1TaSh8OuGmKiddR8y EuaFq4Pb5fPrgqmd8zbYTqroWOpr5RqbwVA7DvbeyiZRFOgR8DKxQfQ1Hmc3k6jBzK/Q v6D4xL+1xQP44eIvUIFXNMM9g1K6DgWno3KkWar6EzRu7hLv3NSv2JhYwy+bLoGWmyPa jZr1axn4Ar2sNJ7Xpv4REbVrT6dzsfeEwFC4quu4xuIxBUaUz0oQBsHsOBlEkQwCbgyq 9dxA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=iP+7r2dt; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id j2si6176793edp.166.2020.07.27.14.10.45; Mon, 27 Jul 2020 14:11:07 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=iP+7r2dt; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726719AbgG0VKZ (ORCPT + 99 others); Mon, 27 Jul 2020 17:10:25 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44536 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726140AbgG0VKW (ORCPT ); Mon, 27 Jul 2020 17:10:22 -0400 Received: from mail-lf1-x144.google.com (mail-lf1-x144.google.com [IPv6:2a00:1450:4864:20::144]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 45611C061794; Mon, 27 Jul 2020 14:10:22 -0700 (PDT) Received: by mail-lf1-x144.google.com with SMTP id i80so9772396lfi.13; Mon, 27 Jul 2020 14:10:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:mime-version :content-transfer-encoding; bh=gCi4uQupGH9wlbYDIIaXDuhHkIJkqnW5jRCmhH4yzys=; b=iP+7r2dt0mrERD07SdS+TGzP3SyiL1AoMitm778mxLBgJZrySPoHh14xUJfzeLz3EW daetsozIsLIZLGcVVraW3Pmc5VAuA2+odgfJLWBYul6shRJ9hh0Qgp/svRTItr1aTWTe RxWRKJCIdxzKFXV/QM9dPDBoGbmKArmISv250T5rYezYh69L6dIp0+NntO7tF00HJwr2 R1KkIYCfuDYF3NKEpywJJuDXrFTbw83C9pNgaChmhw5v05WuAFe1eTxpXzAkoMG+dPLP DsRsoV7FL6PuFxs1I+S81exnBP7IBPH+DBS7qPlw8yaNONrobm9F0AqURFY18yLGtnSC k0lA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:mime-version :content-transfer-encoding; bh=gCi4uQupGH9wlbYDIIaXDuhHkIJkqnW5jRCmhH4yzys=; b=njxqLsWRPkpAbi7HqUA+39SJw/UM+zx0VFlLuQmO2qfQJgVAR24z07NOzpdl0QhhN9 alvBrpWCBwjmzoEO2B43x/HHlNOzR9LyWdaRTryyP7T+rkN0LBt+6zVYbW9TfvyEVor7 8QXeoPKj1sKSnow6hQTXZElda3DoaqUT/oE6QRIBiLpnjt8iYYrTVWf6rwcybLM63gha YDvnGpNQFnQPdNg4I4ThDRVWzYV67GtKeSPSRXa6r5oGfB5cuaPKE2bF+ExvSHqAvIqn zD0nbxZHVRQIjZDAXq75wtAuxRUmv3KY12yh+HnLsliE38EZt8NJ21MJr4e1KtP7WozL uYEQ== X-Gm-Message-State: AOAM532jd9m04i0YoTwEKoIINIlMyAM7BjYLQD1mf4E2vzLD/9fiyREg 98M0r6INbzg20mkH5iUc3w4icDii850= X-Received: by 2002:a05:6512:10ca:: with SMTP id k10mr12911706lfg.177.1595884220130; Mon, 27 Jul 2020 14:10:20 -0700 (PDT) Received: from pc638.lan (h5ef52e31.seluork.dyn.perspektivbredband.net. [94.245.46.49]) by smtp.gmail.com with ESMTPSA id s1sm3266078lfi.76.2020.07.27.14.10.19 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 27 Jul 2020 14:10:19 -0700 (PDT) From: "Uladzislau Rezki (Sony)" To: LKML , RCU , linux-mm@kvack.org, "Paul E . McKenney" Cc: Andrew Morton , "Theodore Y . Ts'o" , Matthew Wilcox , Joel Fernandes , Sebastian Andrzej Siewior , Uladzislau Rezki , Oleksiy Avramchenko Subject: [PATCH v2 1/1] rcu/tree: Drop the lock before entering to page allocator Date: Mon, 27 Jul 2020 23:10:12 +0200 Message-Id: <20200727211012.30948-1-urezki@gmail.com> X-Mailer: git-send-email 2.20.1 MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org If the kernel is built with CONFIG_PROVE_RAW_LOCK_NESTING option, the lockedp will complain about violation of the nesting rules: [ 28.060389] ============================= [ 28.060389] [ BUG: Invalid wait context ] [ 28.060389] 5.8.0-rc3-rcu #211 Tainted: G E [ 28.060389] ----------------------------- [ 28.060390] vmalloc_test/0/523 is trying to lock: [ 28.060390] ffff96df7ffe0228 (&zone->lock){-.-.}-{3:3}, at: get_page_from_freelist+0xcf0/0x16d0 [ 28.060391] other info that might help us debug this: [ 28.060391] context-{5:5} [ 28.060392] 2 locks held by vmalloc_test/0/523: [ 28.060392] #0: ffffffffc06750d0 (prepare_for_test_rwsem){++++}-{4:4}, at: test_func+0x76/0x240 [test_vmalloc] [ 28.060393] #1: ffff96df5fa1d390 (krc.lock){..-.}-{2:2}, at: kvfree_call_rcu+0x5c/0x230 [ 28.060395] stack backtrace: [ 28.060395] CPU: 0 PID: 523 Comm: vmalloc_test/0 Tainted: G E 5.8.0-rc3-rcu #211 [ 28.060395] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014 [ 28.060396] Call Trace: [ 28.060397] dump_stack+0x96/0xd0 [ 28.060397] __lock_acquire.cold.65+0x166/0x2d7 [ 28.060398] ? find_held_lock+0x2d/0x90 [ 28.060399] lock_acquire+0xad/0x370 [ 28.060400] ? get_page_from_freelist+0xcf0/0x16d0 [ 28.060401] ? mark_held_locks+0x48/0x70 [ 28.060402] _raw_spin_lock+0x25/0x30 [ 28.060403] ? get_page_from_freelist+0xcf0/0x16d0 [ 28.060404] get_page_from_freelist+0xcf0/0x16d0 [ 28.060405] ? __lock_acquire+0x3ee/0x1b90 [ 28.060407] __alloc_pages_nodemask+0x16a/0x3a0 [ 28.060408] __get_free_pages+0xd/0x30 [ 28.060409] kvfree_call_rcu+0x18a/0x230 Internally the kfree_rcu() uses raw_spinlock_t whereas the page allocator internally deals with spinlock_t to access to its zones. In order to prevent such vialation that is in question we can drop the internal raw_spinlock_t before entering to the page allocaor. Short changelog (v1 -> v2): - rework the commit message; - rework the patch making it smaller; - add more comments. Signed-off-by: Uladzislau Rezki (Sony) --- kernel/rcu/tree.c | 27 +++++++++++++++++++++++++++ 1 file changed, 27 insertions(+) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 21c2fa5bd8c3..2de112404121 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -3287,6 +3287,8 @@ kvfree_call_rcu_add_ptr_to_bulk(struct kfree_rcu_cpu *krcp, void *ptr) return false; lockdep_assert_held(&krcp->lock); + lockdep_assert_irqs_disabled(); + idx = !!is_vmalloc_addr(ptr); /* Check if a new block is required. */ @@ -3306,6 +3308,29 @@ kvfree_call_rcu_add_ptr_to_bulk(struct kfree_rcu_cpu *krcp, void *ptr) if (IS_ENABLED(CONFIG_PREEMPT_RT)) return false; + /* + * If built with CONFIG_PROVE_RAW_LOCK_NESTING option, + * the lockedp will complain about violation of the + * nesting rules. It does the raw_spinlock vs. spinlock + * nesting checks. + * + * That is why we drop the raw lock. Please note IRQs are + * still disabled it guarantees that the "current" stays + * on the same CPU later on when the raw lock is taken + * back. + * + * It is important because if the page allocator is invoked + * in fully preemptible context, it can be that we get a page + * but end up on another CPU. That another CPU might not need + * a page because of having some extra spots in its internal + * array for pointer collecting. Staying on same CPU eliminates + * described issue. + * + * Dropping the lock also reduces the critical section by + * the time taken by the page allocator to obtain a page. + */ + raw_spin_unlock(&krcp->lock); + /* * NOTE: For one argument of kvfree_rcu() we can * drop the lock and get the page in sleepable @@ -3315,6 +3340,8 @@ kvfree_call_rcu_add_ptr_to_bulk(struct kfree_rcu_cpu *krcp, void *ptr) */ bnode = (struct kvfree_rcu_bulk_data *) __get_free_page(GFP_NOWAIT | __GFP_NOWARN); + + raw_spin_lock(&krcp->lock); } /* Switch to emergency path. */ -- 2.20.1