From: Shakeel Butt
Date: Tue, 19 Jan 2021 08:39:28 -0800
Subject: Re: [PATCH v2] mm: page_counter: relayout structure to reduce false sharing
To: Feng Tang
Cc: Andrew Morton, Michal Hocko, Johannes Weiner, Roman Gushchin, Linux MM, LKML
In-Reply-To: <1611040814-33449-1-git-send-email-feng.tang@intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Jan 18, 2021 at 11:20 PM Feng Tang wrote:
>
> When checking a memory cgroup related performance regression [1],
> from the perf c2c profiling data, we found high false sharing for
> accessing 'usage' and 'parent'.
>
> On 64-bit systems, 'usage' and 'parent' are close to each other
> and can easily end up in one cacheline (for cacheline size == 64+ B).
> 'usage' is usually written, while 'parent' is usually read, owing to
> the cgroup's hierarchical counting nature.
>
> So move the 'parent' to the end of the structure to make sure they
> are in different cache lines.
>
> Following are some performance data with the patch, against
> v5.11-rc1.
> [ In the data, A means a platform with 2 sockets 48C/96T,
> B is a platform of 4 sockets 72C/144T, and a %stddev is shown
> if it is bigger than 2%. P100/P50 means the number of test tasks
> equals 100%/50% of nr_cpu ]
>
> will-it-scale/malloc1
> ---------------------
>             v5.11-rc1     change   v5.11-rc1+patch
>
> A-P100      15782 ± 2%    -0.1%    15765 ± 3%       will-it-scale.per_process_ops
> A-P50       21511         +8.9%    23432            will-it-scale.per_process_ops
> B-P100      9155          +2.2%    9357             will-it-scale.per_process_ops
> B-P50       10967         +7.1%    11751 ± 2%       will-it-scale.per_process_ops
>
> will-it-scale/pagefault2
> ------------------------
>             v5.11-rc1     change   v5.11-rc1+patch
>
> A-P100      79028         +3.0%    81411            will-it-scale.per_process_ops
> A-P50       183960 ± 2%   +4.4%    192078 ± 2%      will-it-scale.per_process_ops
> B-P100      85966         +9.9%    94467 ± 3%       will-it-scale.per_process_ops
> B-P50       198195        +9.8%    217526           will-it-scale.per_process_ops
>
> fio (4k/1M is block size)
> -------------------------
>             v5.11-rc1     change   v5.11-rc1+patch
>
> A-P50-r-4k  16881 ± 2%    +1.2%    17081 ± 2%       fio.read_bw_MBps
> A-P50-w-4k  3931          +4.5%    4111 ± 2%        fio.write_bw_MBps
> A-P50-r-1M  15178         -0.2%    15154            fio.read_bw_MBps
> A-P50-w-1M  3924          +0.1%    3929             fio.write_bw_MBps
>
> [1] https://lore.kernel.org/lkml/20201102091543.GM31092@shao2-debian/
> Signed-off-by: Feng Tang
> Reviewed-by: Roman Gushchin
> Cc: Johannes Weiner
> Cc: Michal Hocko

Reviewed-by: Shakeel Butt
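
For anyone following the layout argument, here is a rough sketch in C.
It is not the kernel's struct page_counter: the struct names
(counter_old, counter_new), the collapsed rest[8] array, the fixed
64-byte CACHELINE and the helper may_share_line() are all assumptions
made for illustration, and uint64_t stands in for the kernel types so
the offsets are the same on any host. It only shows why a read-mostly
field that sits close to a write-hot field can land in the same
cacheline, while a field at least one cacheline away cannot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define CACHELINE 64 /* assumed cacheline size in bytes */

/* Hypothetical "before" layout: 'parent' follows a few limits. */
struct counter_old {
	uint64_t usage;              /* written on every charge/uncharge */
	uint64_t min, low, high, max;
	struct counter_old *parent;  /* read while walking up the hierarchy */
	uint64_t rest[8];            /* remaining fields, collapsed (assumed count) */
};

/* Hypothetical "after" layout: 'parent' moved to the end. */
struct counter_new {
	uint64_t usage;
	uint64_t min, low, high, max;
	uint64_t rest[8];
	struct counter_new *parent;
};

/*
 * Two fields can fall into one cacheline for some placement of the
 * structure only if their start offsets are less than one line apart.
 */
static int may_share_line(size_t off_a, size_t off_b)
{
	size_t dist = off_a > off_b ? off_a - off_b : off_b - off_a;

	return dist < CACHELINE;
}

/* Print the offset of 'parent' in both layouts and the verdict. */
static void report_layout(void)
{
	size_t old_off = offsetof(struct counter_old, parent); /* 5 * 8 = 40 */
	size_t new_off = offsetof(struct counter_new, parent); /* 13 * 8 = 104 */

	/* Sanity check: 'usage' is the first member in both layouts. */
	assert(offsetof(struct counter_old, usage) == 0);
	assert(offsetof(struct counter_new, usage) == 0);

	printf("old: parent at %zu, may share a line with usage: %d\n",
	       old_off, may_share_line(0, old_off));
	printf("new: parent at %zu, may share a line with usage: %d\n",
	       new_off, may_share_line(0, new_off));
}
```

Calling report_layout() from a small main() prints both offsets. In
this sketch the old 'parent' is 40 bytes after 'usage', so the two can
share a 64-byte line, while the new one is 104 bytes away and cannot,
whatever the alignment of the structure.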