Date: Mon, 25 Jul 2022 10:44:23 -0700
Subject: Re: [PATCH 1/2] x86/fpu: Measure the Latency of XSAVE and XRSTOR
From: Dave Hansen
To: David Laight, 'Yi Sun', "linux-kernel@vger.kernel.org", "x86@kernel.org"
Cc: "sohil.mehta@intel.com", "tony.luck@intel.com", "heng.su@intel.com"

On 7/24/22 13:54, David Laight wrote:
> I've done some experiments that measure short instruction latencies.
> Basically I found:

Short? The instructions in question can write up to about 12k of data. That's not "short" by any means.

I'm also not sure precision here is all that important. The main things we want to know here are when and where the init and modified optimizations are coming into play. In other words, how often is there actual data that *needs* to be saved and restored and can't be optimized away.

So, sure, if we were measuring a dozen cycles here, you could make an argument that this _might_ be problematic. But, in this case, we really just want to be able to tell when XSAVE/XRSTOR are getting more or less expensive and also get out a minimal amount of data (RFBM/XINUSE) to make a guess as to why that might be.

Is it *REALLY* worth throwing serializing instructions in and moving clock sources to do that? Is the added precision worth it?
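
For readers less familiar with the RFBM/XINUSE terms used above, here is a minimal sketch in C of how those two bitmaps relate to the init and modified optimizations. It is a simplified model, not kernel code: the function and macro names are hypothetical, the XMODIFIED bitmap is passed in as a plain argument even though software cannot read it directly, and legacy-region special cases such as MXCSR are ignored. The only facts it relies on are that RFBM is XCR0 ANDed with the instruction's EDX:EAX mask, and that a component in its init state (XINUSE bit clear) need not be written.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions of a few XSAVE state components (illustrative names). */
#define COMP_X87 (1ULL << 0)
#define COMP_SSE (1ULL << 1)
#define COMP_AVX (1ULL << 2)

/* Requested-feature bitmap: what the instruction is allowed to touch. */
static uint64_t model_rfbm(uint64_t xcr0, uint64_t instr_mask)
{
    return xcr0 & instr_mask;
}

/* Init optimization only: skip components that are in their init state. */
static uint64_t model_saved_init_opt(uint64_t rfbm, uint64_t xinuse)
{
    return rfbm & xinuse;
}

/*
 * Init plus modified optimization: additionally skip components that
 * have not changed since the last restore (assumed known here).
 */
static uint64_t model_saved_init_mod_opt(uint64_t rfbm, uint64_t xinuse,
                                         uint64_t xmodified)
{
    return rfbm & xinuse & xmodified;
}

/* Small worked example with made-up register contents. */
static void model_selfcheck(void)
{
    uint64_t xcr0 = COMP_X87 | COMP_SSE | COMP_AVX;
    uint64_t rfbm = model_rfbm(xcr0, ~0ULL);      /* request everything */
    uint64_t xinuse = COMP_X87 | COMP_SSE;        /* AVX is in init state */
    uint64_t xmodified = COMP_SSE;                /* only SSE changed */

    assert(rfbm == 0x7);
    assert(model_saved_init_opt(rfbm, xinuse) == 0x3);
    assert(model_saved_init_mod_opt(rfbm, xinuse, xmodified) == 0x2);
}
```

In this model, logging RFBM and XINUSE alongside a coarse latency figure is enough to see whether a cheap save was cheap because little state was actually in use, which is the kind of guess described above.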