Message-ID: <293db107-a572-592f-cc27-e59ab81a4e60@redhat.com>
Date: Sun, 2 Apr 2023 21:04:04 -0400
Subject: Re: A couple of TSC questions
To: paulmck@kernel.org, Feng Tang
Cc: Thomas Gleixner, linux-kernel@vger.kernel.org
From: Waiman Long
In-Reply-To: <3daa086c-b4a0-47a9-8bfc-aac4139013c4@paulmck-laptop>
X-Mailing-List: linux-kernel@vger.kernel.org

On 3/31/23 13:16, Paul E. McKenney wrote:
> On Tue, Mar 28, 2023 at 02:58:54PM -0700, Paul E. McKenney wrote:
>> On Mon, Mar 27, 2023 at 10:19:54AM +0800, Feng Tang wrote:
>>> On Fri, Mar 24, 2023 at 05:47:33PM -0700, Paul E. McKenney wrote:
>>>> On Wed, Mar 22, 2023 at 01:14:48PM +0800, Feng Tang wrote:
> [ . . . ]
>
>>>>>> Second, we are very occasionally running into console messages like this:
>>>>>>
>>>>>> Measured 2 cycles TSC warp between CPUs, turning off TSC clock.
>>>>>>
>>>>>> This comes from check_tsc_sync_source() and indicates that one CPU's
>>>>>> TSC read produced a later time than a later read from some other CPU.
>>>>>> I am beginning to suspect that these can be caused by unscheduled delays
>>>>>> in the TSC synchronization code, but figured I should ask you if you have
>>>>>> ever seen these. And of course, if so, what the usual causes might be.
>>>>> I haven't seen this error myself or got similar reports. Usually it
>>>>> should be easy to detect once happened, as falling back to HPET
>>>>> will trigger obvious performance degradation.
>>>> And that is exactly what happened. ;-)
>>>>
>>>>> Could you give more detail about when and how it happens, and the
>>>>> HW info like how many sockets the platform has.
>>>> We are in early days, so I am checking for other experiences.
>>>>
>>>>> CC Thomas, Waiman, as they discussed similar case here:
>>>>> https://lore.kernel.org/lkml/87h76ew3sb.ffs@tglx/T/#md4d0a88fb708391654e78312ffa75b481690699f
>>>> Fun! ;-)
>> Waiman, do you recall what fraction of the benefit was provided by the
>> first patch, that is, the one that grouped the sync_lock, last_tsc,
>> max_warp, nr_warps, and random_warps global variables into a single
>> struct?

The purpose of the first patch is just to avoid false cacheline sharing
between the watchdog cpu and another cpu that happens to access nearby
data in the same cacheline. Now I realize that I should have followed up
on this patch series. The problem reported in that patch series happens
on one system only, I believe.

> And what we are seeing is unlikely to be due to cache-latency-induced
> delays. We see a very precise warp, for example, one system always
> has 182 cycles of TSC warp, another 273 cycles, and a third 469 cycles.
> Another is at the insanely large value of about 2^64/10, and shows some
> variation, but that variation is only about 0.1%.
>
> But any given system only sees warp on about half of its reboots.
> Perhaps due to the automation sometimes power cycling?
>
> There are few enough affected systems that investigation will take
> some time.

Maybe the difference in warp is due to the NUMA distance of the running
cpu from the node where the data reside. It will be interesting to see
if my patch helps.

Cheers,
Longman
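
For readers following the thread, the warp condition quoted above (a TSC read that comes out earlier than the read taken just before it, typically on another CPU) can be modelled roughly as below. This is a simplified, single-threaded sketch and not the kernel's check_tsc_sync_source() code: the function name max_tsc_warp and the sample values are hypothetical, and the reads are assumed to be already listed in the global order in which they were taken (the kernel serialises them with sync_lock).

```python
from typing import Iterable, Optional, Tuple


def max_tsc_warp(reads: Iterable[Tuple[int, int]]) -> Tuple[int, int]:
    """Return (max_warp, nr_warps) for TSC reads given in global order.

    Each item is (cpu, tsc). A warp is counted whenever a read returns a
    smaller value than the read taken immediately before it, that is,
    whenever time appears to run backwards between two consecutive reads.
    """
    last_tsc: Optional[int] = None  # most recent value seen by any CPU
    max_warp = 0                    # largest backwards step observed
    nr_warps = 0                    # number of backwards steps observed
    for _cpu, tsc in reads:
        if last_tsc is not None and tsc < last_tsc:
            max_warp = max(max_warp, last_tsc - tsc)
            nr_warps += 1
        last_tsc = tsc
    return max_warp, nr_warps


# Made-up sample: CPU 0 reads 1018 after CPU 1 has already read 1200.
sample = [(0, 1000), (1, 1200), (0, 1018), (1, 1300)]
print(max_tsc_warp(sample))
```

A constant offset between two CPUs' counters would show up in this model as the same max_warp value on every run, which is one way to read the very repeatable per-system numbers reported above.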