From: Mel Gorman
To: Andrew Morton
Cc: Thomas Gleixner, Ingo Molnar, Vlastimil Babka, Hugh Dickins, Linux-MM, Linux-RT-Users, LKML, Mel Gorman
Subject: [PATCH 1/1] mm/vmstat: Protect per cpu variables with preempt disable on RT
Date: Thu, 5 Aug 2021 17:00:19 +0100
Message-Id: <20210805160019.1137-2-mgorman@techsingularity.net>
In-Reply-To: <20210805160019.1137-1-mgorman@techsingularity.net>
References: <20210805160019.1137-1-mgorman@techsingularity.net>

From: Ingo Molnar

Disable preemption on -RT for the vmstat code.
On vanilla the code runs in IRQ-off regions while on -RT it may not when
stats are updated under a local_lock. "preempt_disable" ensures that the
same resource is not updated in parallel due to preemption.

This patch differs from the preempt-rt version where __count_vm_event and
__count_vm_events are also protected. The counters are explicitly "allowed
to be racy" so there is no need to protect them from preemption. Only
the accurate page stats that are updated by a read-modify-write need
protection. This patch also differs in that a preempt_[en|dis]able_rt
helper is not used. As vmstat is the only user of the helper, it was
suggested that it be open-coded in vmstat.c instead of risking the helper
being used in unnecessary contexts.

Signed-off-by: Ingo Molnar
Signed-off-by: Thomas Gleixner
Signed-off-by: Mel Gorman
---
 mm/vmstat.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 48 insertions(+)

diff --git a/mm/vmstat.c b/mm/vmstat.c
index b0534e068166..2c7e7569a453 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -319,6 +319,16 @@ void __mod_zone_page_state(struct zone *zone, enum zone_stat_item item,
 	long x;
 	long t;
 
+	/*
+	 * Accurate vmstat updates require a RMW. On !PREEMPT_RT kernels,
+	 * atomicity is provided by IRQs being disabled -- either explicitly
+	 * or via local_lock_irq. On PREEMPT_RT, local_lock_irq only disables
+	 * CPU migrations and preemption potentially corrupts a counter so
+	 * disable preemption.
+	 */
+	if (IS_ENABLED(CONFIG_PREEMPT_RT))
+		preempt_disable();
+
 	x = delta + __this_cpu_read(*p);
 
 	t = __this_cpu_read(pcp->stat_threshold);
@@ -328,6 +338,9 @@ void __mod_zone_page_state(struct zone *zone, enum zone_stat_item item,
 		x = 0;
 	}
 	__this_cpu_write(*p, x);
+
+	if (IS_ENABLED(CONFIG_PREEMPT_RT))
+		preempt_enable();
 }
 EXPORT_SYMBOL(__mod_zone_page_state);
 
@@ -350,6 +363,10 @@ void __mod_node_page_state(struct pglist_data *pgdat, enum node_stat_item item,
 		delta >>= PAGE_SHIFT;
 	}
 
+	/* See __mod_node_page_state */
+	if (IS_ENABLED(CONFIG_PREEMPT_RT))
+		preempt_disable();
+
 	x = delta + __this_cpu_read(*p);
 
 	t = __this_cpu_read(pcp->stat_threshold);
@@ -359,6 +376,9 @@ void __mod_node_page_state(struct pglist_data *pgdat, enum node_stat_item item,
 		x = 0;
 	}
 	__this_cpu_write(*p, x);
+
+	if (IS_ENABLED(CONFIG_PREEMPT_RT))
+		preempt_enable();
 }
 EXPORT_SYMBOL(__mod_node_page_state);
 
@@ -391,6 +411,10 @@ void __inc_zone_state(struct zone *zone, enum zone_stat_item item)
 	s8 __percpu *p = pcp->vm_stat_diff + item;
 	s8 v, t;
 
+	/* See __mod_node_page_state */
+	if (IS_ENABLED(CONFIG_PREEMPT_RT))
+		preempt_disable();
+
 	v = __this_cpu_inc_return(*p);
 	t = __this_cpu_read(pcp->stat_threshold);
 	if (unlikely(v > t)) {
@@ -399,6 +423,9 @@ void __inc_zone_state(struct zone *zone, enum zone_stat_item item)
 		zone_page_state_add(v + overstep, zone, item);
 		__this_cpu_write(*p, -overstep);
 	}
+
+	if (IS_ENABLED(CONFIG_PREEMPT_RT))
+		preempt_enable();
 }
 
 void __inc_node_state(struct pglist_data *pgdat, enum node_stat_item item)
@@ -409,6 +436,10 @@ void __inc_node_state(struct pglist_data *pgdat, enum node_stat_item item)
 
 	VM_WARN_ON_ONCE(vmstat_item_in_bytes(item));
 
+	/* See __mod_node_page_state */
+	if (IS_ENABLED(CONFIG_PREEMPT_RT))
+		preempt_disable();
+
 	v = __this_cpu_inc_return(*p);
 	t = __this_cpu_read(pcp->stat_threshold);
 	if (unlikely(v > t)) {
@@ -417,6 +448,9 @@ void __inc_node_state(struct pglist_data *pgdat, enum node_stat_item item)
 		node_page_state_add(v + overstep, pgdat, item);
 		__this_cpu_write(*p, -overstep);
 	}
+
+	if (IS_ENABLED(CONFIG_PREEMPT_RT))
+		preempt_enable();
 }
 
 void __inc_zone_page_state(struct page *page, enum zone_stat_item item)
@@ -437,6 +471,10 @@ void __dec_zone_state(struct zone *zone, enum zone_stat_item item)
 	s8 __percpu *p = pcp->vm_stat_diff + item;
 	s8 v, t;
 
+	/* See __mod_node_page_state */
+	if (IS_ENABLED(CONFIG_PREEMPT_RT))
+		preempt_disable();
+
 	v = __this_cpu_dec_return(*p);
 	t = __this_cpu_read(pcp->stat_threshold);
 	if (unlikely(v < - t)) {
@@ -445,6 +483,9 @@ void __dec_zone_state(struct zone *zone, enum zone_stat_item item)
 		zone_page_state_add(v - overstep, zone, item);
 		__this_cpu_write(*p, overstep);
 	}
+
+	if (IS_ENABLED(CONFIG_PREEMPT_RT))
+		preempt_enable();
 }
 
 void __dec_node_state(struct pglist_data *pgdat, enum node_stat_item item)
@@ -455,6 +496,10 @@ void __dec_node_state(struct pglist_data *pgdat, enum node_stat_item item)
 
 	VM_WARN_ON_ONCE(vmstat_item_in_bytes(item));
 
+	/* See __mod_node_page_state */
+	if (IS_ENABLED(CONFIG_PREEMPT_RT))
+		preempt_disable();
+
 	v = __this_cpu_dec_return(*p);
 	t = __this_cpu_read(pcp->stat_threshold);
 	if (unlikely(v < - t)) {
@@ -463,6 +508,9 @@ void __dec_node_state(struct pglist_data *pgdat, enum node_stat_item item)
 		node_page_state_add(v - overstep, pgdat, item);
 		__this_cpu_write(*p, overstep);
 	}
+
+	if (IS_ENABLED(CONFIG_PREEMPT_RT))
+		preempt_enable();
 }
 
 void __dec_zone_page_state(struct page *page, enum zone_stat_item item)
-- 
2.31.1