Date: Wed, 18 Aug 2021 10:30:04 +0800
From: Feng Tang
To: Michal Koutný
Cc: Johannes Weiner, Linus Torvalds, kernel test robot, Roman Gushchin,
	Michal Hocko, Shakeel Butt, Balbir Singh, Tejun Heo, Andrew Morton,
	LKML, lkp@lists.01.org, kernel test robot, "Huang, Ying",
	Zhengjun Xing, andi.kleen@intel.com
Subject: Re: [mm] 2d146aa3aa: vm-scalability.throughput -36.4% regression
Message-ID: <20210818023004.GA17956@shbuild999.sh.intel.com>
References: <20210811031734.GA5193@xsang-OptiPlex-9020>
	<20210812031910.GA63920@shbuild999.sh.intel.com>
	<20210816032855.GB72770@shbuild999.sh.intel.com>
	<20210817024500.GC72770@shbuild999.sh.intel.com>
	<20210817164737.GA23342@blackbody.suse.cz>
In-Reply-To: <20210817164737.GA23342@blackbody.suse.cz>

Hi Michal,

On Tue, Aug 17, 2021 at 06:47:37PM +0200, Michal Koutný wrote:
> On Tue, Aug 17, 2021 at 10:45:00AM +0800, Feng Tang wrote:
> > Initially from the perf-c2c data, the in-cacheline hotspots are only
> > 0x0 and 0x10, and if we extend to 2 cachelines, there is one more
> > offset, 0x54 (css.flags), but still I can't figure out which member
> > inside the 128-byte range is written frequently.
>
> Is it certain that the perf-c2c reported offsets are in the cacheline of
> the first bytes of struct cgroup_subsys_state? (Yeah, it looks to me so,
> given what code accesses those and your padding fixing it. I'm just
> raising it in case there was anything non-obvious.)

Thanks for checking. Yes, they are. 'struct cgroup_subsys_state' is the
first member of 'struct mem_cgroup', whose address is always cacheline
aligned (debug info shows it is even 2KB or 4KB aligned).

> > /* pahole info for cgroup_subsys_state */
> > struct cgroup_subsys_state {
> >         struct cgroup *            cgroup;               /*     0     8 */
> >         struct cgroup_subsys *     ss;                   /*     8     8 */
> >         struct percpu_ref          refcnt;               /*    16    16 */
> >         struct list_head           sibling;              /*    32    16 */
> >         struct list_head           children;             /*    48    16 */
> >         /* --- cacheline 1 boundary (64 bytes) --- */
> >         struct list_head           rstat_css_node;       /*    64    16 */
> >         int                        id;                   /*    80     4 */
> >         unsigned int               flags;                /*    84     4 */
> >         u64                        serial_nr;            /*    88     8 */
> >         atomic_t                   online_cnt;           /*    96     4 */
> >
> >         /* XXX 4 bytes hole, try to pack */
> >
> >         struct work_struct         destroy_work;         /*   104    32 */
> >         /* --- cacheline 2 boundary (128 bytes) was 8 bytes ago --- */
> >
> > Since the test run implies this is cacheline related, and I'm not very
> > familiar with the mem_cgroup code, the original perf-c2c log is
> > attached, which may give more hints.
>
> As noted by Johannes, even in atomic mode, the refcnt would have the
> atomic part elsewhere. The other members shouldn't be written frequently
> unless there are some intense modifications of the cgroup tree in
> parallel.
> Does the benchmark create lots of memory cgroups in such a fashion?

As Shakeel also mentioned, this 0day vm-scalability test doesn't involve
any explicit mem_cgroup configuration, and it runs on a simplified Debian
10 rootfs which has some systemd boot-time cgroup setup.

Thanks,
Feng

> Regards,
> Michal
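
P.S. For anyone following along, here is a minimal userspace sketch of the
padding idea Michal refers to above ("your padding fixing it"): force the
frequently written field onto its own 64-byte cacheline so it cannot
false-share with the read-mostly pointers at the head of the structure.
The struct and field names below are made up purely for illustration and
are not the actual kernel change under discussion; in the kernel the
analogous annotation would be ____cacheline_aligned_in_smp rather than
C11 alignas.

/*
 * Illustrative userspace sketch only.  A frequently written counter is
 * pushed onto its own 64-byte cacheline, away from the read-mostly
 * pointers at the start of the struct, so concurrent writers no longer
 * invalidate the line the readers are hitting.
 */
#include <stdalign.h>
#include <stddef.h>
#include <stdio.h>

struct padded_example {
	void *cgroup;                        /* read-mostly            */
	void *ss;                            /* read-mostly            */
	/* hot, frequently written part starts on a new cacheline */
	alignas(64) unsigned long hot_counter;
};

int main(void)
{
	/* With the alignment above, hot_counter lands at offset 64. */
	printf("hot_counter offset: %zu\n",
	       offsetof(struct padded_example, hot_counter));
	printf("struct size:        %zu\n", sizeof(struct padded_example));
	return 0;
}

Running pahole on such a struct shows the same kind of hole/padding
report as the cgroup_subsys_state layout quoted earlier, which is how the
offsets above were cross-checked.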