Date: Mon, 30 Nov 2020 09:48:20 +0100
From: Michal Hocko
To: Feng Tang
Cc: Xing Zhengjun, Waiman Long, Linus Torvalds, Andrew Morton, Shakeel Butt,
 Chris Down, Johannes Weiner, Roman Gushchin, Tejun Heo, Vladimir Davydov,
 Yafang Shao, LKML, lkp@lists.01.org, lkp@intel.com, zhengjun.xing@intel.com,
 ying.huang@intel.com, andi.kleen@intel.com
Subject: Re: [LKP] Re: [mm/memcg]
 bd0b230fe1: will-it-scale.per_process_ops -22.7% regression
Message-ID: <20201130084820.GB17338@dhcp22.suse.cz>
In-Reply-To: <20201125062445.GA51005@shbuild999.sh.intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed 25-11-20 14:24:45, Feng Tang wrote:
[...]
> I think we finally found the trick :), further debugging shows it
> is not related to the alignment inside one cacheline, but the
> adjacency of 2 adjacent cachelines (2N and 2N+1, one pair of 128 bytes).
>
> For structure mem_cgroup, members 'vmstats_local', 'vmstats_percpu'
> sit in one cacheline, while 'vmstats[]' sits in the next cacheline,
> and when "adjacent cacheline prefetch" is enabled, if these 2 lines
> sit in one pair (128 bytes), say 2N and 2N+1, then there seems to
> be some kind of false sharing, and if they sit in 2 pairs, say
> 2N-1 and 2N then it's fine.
>
> And with the following patch to relayout these members, the performance
> is restored and even better, while reducing sizeof 'struct mem_cgroup'
> by 64 bytes.
>
>           parent_commit    Waiman's_commit    +relayout patch
>
> result        187K              145K               200K
>
> Also, if we disable the hw prefetch feature, the Waiman's commit
> and its parent commit will have no performance difference.
>
> Thanks,
> Feng
>
> From 2e63af34fa4853b2dd9669867c37a3cf07f7a505 Mon Sep 17 00:00:00 2001
> From: Feng Tang
> Date: Wed, 25 Nov 2020 13:22:21 +0800
> Subject: [PATCH] mm: memcg: relayout structure mem_cgroup to avoid cache
>  interfering
>
> 0day reported one -22.7% regression for will-it-scale page_fault2
> case [1] on a 4 sockets 144 CPU platform, and bisected it to be
> caused by Waiman's optimization (commit bd0b230fe1) of saving one
> 'struct page_counter' space for 'struct mem_cgroup'.
>
> Initially we thought it was due to the cache alignment change introduced
> by the patch, but further debugging shows that it is due to some hot data
> members ('vmstats_local', 'vmstats_percpu', 'vmstats') sitting in 2 adjacent
> cachelines (2N and 2N+1), and when adjacent cache line prefetch
> is enabled, it triggers an "extended level" of cache false sharing for
> 2 adjacent cache lines.
>
> So exchange the 2 member blocks, while keeping mostly the original
> cache alignment, which can restore and even enhance the performance,
> and save 64 bytes of space for 'struct mem_cgroup' (from 2880 to 2816,
> with 0day's default RHEL-8.3 kernel config)
>
> [1]. https://lore.kernel.org/lkml/20201102091543.GM31092@shao2-debian/
>
> Fixes: bd0b230fe145 ("mm/memcg: unify swap and memsw page counters")
> Reported-by: kernel test robot
> Signed-off-by: Feng Tang

Sorry for the late reply. This is indeed surprising! I was really
expecting the page counter to be the culprit. Anyway, this rearrangement
looks ok as well. The moving_account related stuff is still after the
padding, which is good because this rare operation shouldn't really
interfere with the rest of the structure.

Btw. now you made me look into the history and I have noticed
e81bf9793b18 ("mem_cgroup: make sure moving_account, move_lock_task and
stat_cpu in the same cacheline") so this is not the first time we are
dealing with a regression here.
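
For anyone following along, the 2N/2N+1 pairing from the quoted analysis
can be sketched as a small toy model. This is only an illustration, not
the real struct mem_cgroup: the struct names, member names and sizes
below are made up, and it assumes 64-byte cachelines, an object base
aligned to 128 bytes, and a prefetcher that works on aligned 128-byte
pairs.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdbool.h>
#include <stddef.h>   /* size_t, offsetof */

/* Assumed sizes: 64-byte cachelines, fetched as aligned 128-byte pairs
 * when adjacent cacheline prefetch is enabled. */
#define CACHELINE_BYTES 64u
#define PAIR_BYTES      (2u * CACHELINE_BYTES)

/* Index of the cacheline holding byte 'offset' of a 128-byte aligned object. */
static inline size_t cacheline_index(size_t offset)
{
	return offset / CACHELINE_BYTES;
}

/* True if both offsets fall into the same aligned pair (lines 2N and 2N+1). */
static inline bool same_prefetch_pair(size_t a, size_t b)
{
	return a / PAIR_BYTES == b / PAIR_BYTES;
}

/*
 * Hypothetical layouts. 'ptrs' stands in for the read-mostly members and
 * 'counters' for the frequently written ones; 'other' is whatever precedes
 * them in the structure.
 */
struct layout_bad {
	_Alignas(128) char other[128];
	char ptrs[64];      /* cacheline 2 */
	char counters[64];  /* cacheline 3: same pair as 'ptrs' (2N, 2N+1) */
};

struct layout_good {
	_Alignas(128) char other[64];
	char ptrs[64];      /* cacheline 1 */
	char counters[64];  /* cacheline 2: next pair (2N-1, 2N) */
};

/* Compile-time checks of the offsets the comments above rely on. */
static_assert(offsetof(struct layout_bad, ptrs) == 128, "ptrs in line 2");
static_assert(offsetof(struct layout_bad, counters) == 192, "counters in line 3");
static_assert(offsetof(struct layout_good, ptrs) == 64, "ptrs in line 1");
static_assert(offsetof(struct layout_good, counters) == 128, "counters in line 2");
```

With a 128-byte aligned base, same_prefetch_pair(0, 64) is true (lines 0
and 1) while same_prefetch_pair(64, 128) is false (lines 1 and 2). In both
toy layouts the two members sit in adjacent cachelines; only the second
one keeps them out of a common 128-byte pair.
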

Linus has already merged the patch but for the record
Acked-by: Michal Hocko

Thanks a lot for pursuing this!
-- 
Michal Hocko
SUSE Labs