From: Mel Gorman <mgorman@techsingularity.net>
To: Linux-MM
Cc: Linux-RT-Users, LKML, Chuck Lever, Jesper Dangaard Brouer,
    Matthew Wilcox, Mel Gorman
Subject: [RFC PATCH 0/6] Use local_lock for pcp protection and reduce stat overhead
Date: Mon, 29 Mar 2021 13:06:42 +0100
Message-Id: <20210329120648.19040-1-mgorman@techsingularity.net>

This series requires patches in Andrew's tree, so the series is also
available at

  git://git.kernel.org/pub/scm/linux/kernel/git/mel/linux.git mm-percpu-local_lock-v1r15

tldr: Jesper and Chuck, it would be nice to verify whether this series
helps the allocation rate of the bulk page allocator. RT people, this
*partially* addresses some problems PREEMPT_RT has with the page
allocator, but it needs review.

The PCP (per-cpu page allocator in page_alloc.c) shares locking
requirements with vmstat, which is inconvenient and causes some
issues. Possibly because of that, the PCP lists and vmstat share the
same per-cpu space, meaning that vmstat updates can dirty the cache
lines holding the per-cpu lists of other CPUs unless padding is used.
The series splits that structure and separates the locking.

Second, PREEMPT_RT considers the following sequence to be unsafe, as
documented in Documentation/locking/locktypes.rst:

	local_irq_disable();
	spin_lock(&lock);

The PCP allocator has this sequence in
rmqueue_pcplist (local_irq_save) -> __rmqueue_pcplist ->
rmqueue_bulk (spin_lock). This series explicitly separates the
locking requirements for the PCP lists (local_lock) and the stat
updates (IRQs disabled).
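To make the locking split concrete, the sketch below shows the
local_lock pattern in miniature. The names (pcp_sketch, pcp_pop_sketch)
are invented for illustration and are not the code in the series; the
series itself reworks rmqueue_pcplist and friends in mm/page_alloc.c:

	#include <linux/list.h>
	#include <linux/local_lock.h>
	#include <linux/mm_types.h>
	#include <linux/mmzone.h>
	#include <linux/percpu.h>

	/*
	 * Illustrative only. The local_lock documents exactly what the
	 * critical section protects: the PCP lists and nothing else.
	 */
	struct pcp_sketch {
		local_lock_t lock;	/* protects lists below */
		struct list_head lists[MIGRATE_PCPTYPES];
	};

	/* List initialisation at boot is omitted for brevity. */
	static DEFINE_PER_CPU(struct pcp_sketch, pcp_sketch) = {
		.lock = INIT_LOCAL_LOCK(lock),
	};

	static struct page *pcp_pop_sketch(int migratetype)
	{
		struct page *page = NULL;
		struct list_head *list;
		unsigned long flags;

		local_lock_irqsave(&pcp_sketch.lock, flags);
		list = this_cpu_ptr(&pcp_sketch.lists[migratetype]);
		if (!list_empty(list)) {
			page = list_first_entry(list, struct page, lru);
			list_del(&page->lru);
		}
		local_unlock_irqrestore(&pcp_sketch.lock, flags);
		return page;
	}

On !PREEMPT_RT this costs the same as local_irq_save(); on PREEMPT_RT
the local_lock maps to a per-CPU spinlock and the section stays
preemptible, so a later spin_lock() in the refill path no longer nests
inside a raw IRQ-disabled region.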
Once that is done, the length of time IRQs are disabled can be
reduced, and in some cases IRQ disabling can be replaced with
preempt_disable.

After that, it was very obvious that zone_statistics in particular
has far too much overhead and leaves IRQs disabled for longer than
necessary. It maintains perfectly accurate counters, which requires
IRQs to be disabled for the parallel RMW sequences, when inaccurate
counters like the vm_events would do. The series makes the NUMA
statistics (NUMA_HIT and friends) inaccurate counters that only
require preemption to be disabled.

Finally, the bulk page allocator can then do all of its stat updates
in bulk with IRQs enabled, which should improve its efficiency.
Technically, this could have been done without the local_lock and
vmstat conversion work; the ordering simply reflects the timing of
when the different series were implemented.

No performance data is included because, despite the overhead of the
stats, it is within the noise for most workloads. However, Jesper and
Chuck may observe a significant difference with the same tests used
for the bulk page allocator. The series is more likely to be
interesting to the RT folk in terms of slowly getting the PREEMPT_RT
tree into mainline.

 drivers/base/node.c    |  18 +--
 include/linux/mmzone.h |  29 +++--
 include/linux/vmstat.h |  65 ++++++-----
 mm/mempolicy.c         |   2 +-
 mm/page_alloc.c        | 173 ++++++++++++++++------------
 mm/vmstat.c            | 254 +++++++++++++++--------------------
 6 files changed, 254 insertions(+), 287 deletions(-)
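To illustrate the zone_statistics point, the difference between an
"accurate" counter and a vm_event-style "inaccurate" one boils down
to the sketch below. The names and the fold threshold are invented
for the example; the real conversion is in mm/vmstat.c:

	#include <linux/atomic.h>
	#include <linux/irqflags.h>
	#include <linux/percpu.h>
	#include <linux/types.h>

	static DEFINE_PER_CPU(s8, demo_diff);
	static atomic_long_t demo_total;

	/*
	 * Accurate: the inc-test-fold RMW must not interleave with an
	 * interrupt handler updating the same counter, so IRQs stay
	 * disabled for the whole sequence.
	 */
	static void demo_inc_accurate(void)
	{
		unsigned long flags;
		s8 v;

		local_irq_save(flags);
		v = __this_cpu_inc_return(demo_diff);
		if (v > 32) {	/* arbitrary fold threshold */
			atomic_long_add(v, &demo_total);
			__this_cpu_write(demo_diff, 0);
		}
		local_irq_restore(flags);
	}

	/*
	 * Inaccurate: a per-CPU increment that is safe against both
	 * preemption and interrupts is enough on its own; readers
	 * summing the per-CPU deltas may see a slightly stale total,
	 * which is tolerated, so no IRQ-disabled window is needed.
	 */
	static void demo_inc_inaccurate(void)
	{
		this_cpu_inc(demo_diff);
	}

For NUMA_HIT-style event counts, tolerating that small inaccuracy is
what allows the bulk allocator to batch its stat updates with IRQs
enabled.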