Message-ID: <802b944d-d168-d9c8-add3-1fe17f3985f5@linux.alibaba.com>
Date: Wed, 6 Apr 2022 10:47:22 +0800
Subject: Re: [RFC PATCH] sched: avoid unnecessary atomic_read when sync_core_before_usermode() is empty
From: Tianchen Ding <dtcccc@linux.alibaba.com>
To: Andrew Morton, Thomas Gleixner, Fenghua Yu, Borislav Petkov, Pavel Tatashin, NeilBrown, Vasily Averin, "Matthew Wilcox (Oracle)"
Cc: linux-kernel@vger.kernel.org
References: <20220402030822.11441-1-dtcccc@linux.alibaba.com>
In-Reply-To: <20220402030822.11441-1-dtcccc@linux.alibaba.com>

We've run schbench and found that wakeup latency on some arm64 machines is worse than on others. perf shows a hotspot on atomic_read(&mm->membarrier_state). We're still investigating the real reason behind it (maybe cache or something else hardware related), but we do see that removing this access helps improve performance.
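For context on why gcc keeps the load even though its result is discarded: the kernel's atomic_read() boils down to a READ_ONCE()-style volatile access, and volatile loads may not be optimized away. A minimal userspace sketch (hypothetical names, not the kernel implementation):

/* sketch.c: why an unused atomic_read() still costs a memory access. */
struct mm_sketch {
	int membarrier_state;
};

static inline int atomic_read_sketch(const int *v)
{
	/* Mirrors READ_ONCE(): the volatile cast forces a real load. */
	return *(const volatile int *)v;
}

void membarrier_sync_core_sketch(struct mm_sketch *cur_mm,
				 struct mm_sketch *mm)
{
	if (cur_mm != mm)
		return;
	/* The value is unused, but the load is still emitted at -O2;
	 * this is the access that shows up as a hotspot in perf. */
	atomic_read_sketch(&mm->membarrier_state);
}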
Thanks.

On 2022/4/2 11:08, Tianchen Ding wrote:
> On archs other than x86, CONFIG_ARCH_HAS_SYNC_CORE_BEFORE_USERMODE is
> not defined. We found membarrier_mm_sync_core_before_usermode() looks
> like this when compiled by gcc10:
> 
>     if (current->mm != mm)
>         return;
>     atomic_read(&mm->membarrier_state);
> 
> This memory access is unnecessary. Remove it to improve performance.
> 
> Signed-off-by: Tianchen Ding <dtcccc@linux.alibaba.com>
> ---
>  include/linux/sched/mm.h | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h
> index a80356e9dc69..3ded68d9f913 100644
> --- a/include/linux/sched/mm.h
> +++ b/include/linux/sched/mm.h
> @@ -401,6 +401,7 @@ enum {
>  #include <asm/membarrier.h>
>  #endif
>  
> +#ifdef CONFIG_ARCH_HAS_SYNC_CORE_BEFORE_USERMODE
>  static inline void membarrier_mm_sync_core_before_usermode(struct mm_struct *mm)
>  {
>  	if (current->mm != mm)
> @@ -410,6 +411,11 @@ static inline void membarrier_mm_sync_core_before_usermode(struct mm_struct *mm)
>  		return;
>  	sync_core_before_usermode();
>  }
> +#else
> +static inline void membarrier_mm_sync_core_before_usermode(struct mm_struct *mm)
> +{
> +}
> +#endif
>  
>  extern void membarrier_exec_mmap(struct mm_struct *mm);
> 
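To make the effect of the change concrete: with the config symbol undefined, the helper becomes an empty inline and call sites compile to nothing. A standalone sketch of the same #ifdef structure (simplified stand-in names, not the kernel's):

/* stub_sketch.c: build with and without -DHAVE_SYNC_CORE and compare
 * "gcc -O2 -S" output to see the call site compile away entirely. */
struct mm_stub_sketch {
	int membarrier_state;
};

#ifdef HAVE_SYNC_CORE	/* stand-in for CONFIG_ARCH_HAS_SYNC_CORE_BEFORE_USERMODE */
static inline void sync_core_sketch(struct mm_stub_sketch *cur_mm,
				    struct mm_stub_sketch *mm)
{
	if (cur_mm != mm)
		return;
	/* The real helper reads mm->membarrier_state and then calls
	 * sync_core_before_usermode(); only the load is modeled here. */
	(void)*(volatile int *)&mm->membarrier_state;
}
#else
static inline void sync_core_sketch(struct mm_stub_sketch *cur_mm,
				    struct mm_stub_sketch *mm)
{
	/* Empty stub, as in the patch: the compiler elides everything. */
}
#endif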