From: Dave Hansen
Date: Thu, 17 Mar 2022 17:45:41 -0700
To: Nadav Amit
Cc: kernel test robot, Ingo Molnar, Dave Hansen, LKML, lkp@lists.01.org,
    lkp@intel.com, ying.huang@intel.com, feng.tang@intel.com,
    zhengjun.xing@linux.intel.com, fengwei.yin@intel.com, Andy Lutomirski
Subject: Re: [x86/mm/tlb] 6035152d8e: will-it-scale.per_thread_ops -13.2% regression

On 3/17/22 17:20, Nadav Amit wrote:
> I don’t have other data right now. Let me run some measurements later
> tonight. I understand your explanation, but I still do not see how
> much “later” can the lazy check be that it really matters. Just
> strange.

These will-it-scale tests are really brutal. They're usually sitting in
really tight kernel entry/exit loops.
Everything is pounding on kernel locks and bouncing cachelines around
like crazy. It might only be a few thousand cycles between two
successive kernel entries.

Things like the call_single_queue cacheline have to be dragged from
other CPUs *and* there are locks that you can spin on. While a thread
is doing all this spinning, it is forcing more and more threads into
the lazy TLB state. The longer you spin, the more threads have entered
the kernel, contended on the mmap_lock and gone idle.

Is it really surprising that a loop that can take hundreds of locks can
take a long time?

	for_each_cpu(cpu, cfd->cpumask) {
		csd_lock(csd);
		...
	}
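
To put rough numbers on that loop, here is a toy userspace model in C.
It is only a sketch, not kernel code: the helper names loop_cycles()
and entries_during_loop() are hypothetical, and the per-lock cost is an
assumed input rather than a measurement.

```c
#include <stdint.h>

/*
 * Toy model (hypothetical helpers, not kernel code).
 *
 * Estimate how many cycles it takes to walk a CPU mask when every
 * iteration takes one lock, assuming each lock costs the same number
 * of cycles. Real costs vary with contention and cacheline bouncing.
 */
static uint64_t loop_cycles(unsigned int nr_cpus, uint64_t cycles_per_lock)
{
	return (uint64_t)nr_cpus * cycles_per_lock;
}

/*
 * Estimate how many kernel entries another thread could make while
 * the loop above is still running, given the gap between two
 * successive kernel entries. A zero gap is treated as "no estimate".
 */
static uint64_t entries_during_loop(uint64_t loop_cyc,
				    uint64_t cycles_between_entries)
{
	if (cycles_between_entries == 0)
		return 0;

	return loop_cyc / cycles_between_entries;
}
```

With an assumed 200 CPUs in the mask and an assumed 500 cycles per
lock, loop_cycles(200, 500) comes to 100000 cycles. If another thread
re-enters the kernel every 4000 cycles, entries_during_loop(100000,
4000) gives 25 entries in that window, each one a chance to contend
and go idle.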