Subject: Re: [PATCH] mm: Move mm_count into its own cache line
Date: Fri, 16 Jun 2023 14:54:09 -0400
From: Mathieu Desnoyers
To: Andrew Morton
Cc: linux-kernel@vger.kernel.org, kernel test robot, Aaron Lu,
 Olivier Dion, michael.christie@oracle.com, Feng Tang, John Hubbard,
 Jason Gunthorpe, Peter Xu, linux-mm@kvack.org, Peter Zijlstra
Message-ID: <313c3bea-d710-a769-8cb7-0964614425a2@efficios.com>
In-Reply-To: <20230515143536.114960-1-mathieu.desnoyers@efficios.com>

On 5/15/23 10:35, Mathieu Desnoyers wrote:
> The mm_struct mm_count field is frequently updated by mmgrab/mmdrop
> performed by context switch. This causes false-sharing for surrounding
> mm_struct fields which are read-mostly.

Hi Andrew,

Given that this patch touches mm code, I think it should go through your
tree. Nobody has voiced any objection for a month, Aaron Lu gave his
Reviewed-by tag [1], and it resolves measurable performance regressions.

Would you be willing to pick it up? Please let me know if anything else
is needed.

Thanks!

Mathieu

[1] https://lore.kernel.org/lkml/20230516044050.GA315678@ziqianlu-desk2

>
> This has been observed on a 2-socket/112-core/224-CPU Intel Sapphire
> Rapids server running hackbench, and by the kernel test robot
> will-it-scale testcase.
>
> Move the mm_count field into its own cache line to prevent false-sharing
> with other mm_struct fields.
>
> Move mm_count to the first field of mm_struct to minimize the amount of
> padding required: rather than adding padding before and after the
> mm_count field, padding is only added after mm_count.
>
> Note that I noticed this odd comment in mm_struct:
>
> commit 2e3025434a6b ("mm: relocate 'write_protect_seq' in struct mm_struct")
>
>         /*
>          * With some kernel config, the current mmap_lock's offset
>          * inside 'mm_struct' is at 0x120, which is very optimal, as
>          * its two hot fields 'count' and 'owner' sit in 2 different
>          * cachelines, and when mmap_lock is highly contended, both
>          * of the 2 fields will be accessed frequently, current layout
>          * will help to reduce cache bouncing.
>          *
>          * So please be careful with adding new fields before
>          * mmap_lock, which can easily push the 2 fields into one
>          * cacheline.
>          */
>         struct rw_semaphore mmap_lock;
>
> This comment is rather odd for a few reasons:
>
> - It requires addition/removal of mm_struct fields to carefully consider
>   field alignment of _other_ fields,
> - It expresses the wish to keep an "optimal" alignment for a specific
>   kernel config.
>
> I suspect that the author of this comment may want to revisit this topic
> and perhaps introduce a split-struct approach for struct rw_semaphore,
> if the need is to place various fields of this structure in different
> cache lines.
>
> Fixes: 223baf9d17f2 ("sched: Fix performance regression introduced by mm_cid")
> Fixes: af7f588d8f73 ("sched: Introduce per-memory-map concurrency ID")
> Link: https://lore.kernel.org/lkml/7a0c1db1-103d-d518-ed96-1584a28fbf32@efficios.com
> Reported-by: kernel test robot
> Link: https://lore.kernel.org/oe-lkp/202305151017.27581d75-yujie.liu@intel.com
> Signed-off-by: Mathieu Desnoyers
> Cc: Peter Zijlstra
> Cc: Aaron Lu
> Cc: Olivier Dion
> Cc: michael.christie@oracle.com
> Cc: Andrew Morton
> Cc: Feng Tang
> Cc: John Hubbard
> Cc: Jason Gunthorpe
> Cc: Peter Xu
> Cc: linux-mm@kvack.org
> ---
>  include/linux/mm_types.h | 23 +++++++++++++++--------
>  1 file changed, 15 insertions(+), 8 deletions(-)
>
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index 306a3d1a0fa6..de10fc797c8e 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -583,6 +583,21 @@ struct mm_cid {
>  struct kioctx_table;
>  struct mm_struct {
>  	struct {
> +		/*
> +		 * Fields which are often written to are placed in a separate
> +		 * cache line.
> +		 */
> +		struct {
> +			/**
> +			 * @mm_count: The number of references to &struct
> +			 * mm_struct (@mm_users count as 1).
> +			 *
> +			 * Use mmgrab()/mmdrop() to modify. When this drops to
> +			 * 0, the &struct mm_struct is freed.
> +			 */
> +			atomic_t mm_count;
> +		} ____cacheline_aligned_in_smp;
> +
>  		struct maple_tree mm_mt;
>  #ifdef CONFIG_MMU
>  		unsigned long (*get_unmapped_area) (struct file *filp,
> @@ -620,14 +635,6 @@ struct mm_struct {
>  		 */
>  		atomic_t mm_users;
>
> -		/**
> -		 * @mm_count: The number of references to &struct mm_struct
> -		 * (@mm_users count as 1).
> -		 *
> -		 * Use mmgrab()/mmdrop() to modify. When this drops to 0, the
> -		 * &struct mm_struct is freed.
> -		 */
> -		atomic_t mm_count;
>  #ifdef CONFIG_SCHED_MM_CID
>  		/**
>  		 * @pcpu_cid: Per-cpu current cid.

-- 
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com