Message-ID: <570E67B1.3000708@hpe.com>
Date: Wed, 13 Apr 2016 11:37:21 -0400
From: Waiman Long
To: Ingo Molnar
CC: Thomas Gleixner, Ingo Molnar, "H. Peter Anvin", Jiang Liu, Borislav Petkov, Andy Lutomirski, Scott J Norton, Douglas Hatch, Randy Wright, Peter Zijlstra
Subject: Re: [PATCH v4] x86/hpet: Reduce HPET counter read contention
References: <1460486768-34024-1-git-send-email-Waiman.Long@hpe.com> <20160413061813.GB4705@gmail.com>
In-Reply-To: <20160413061813.GB4705@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 04/13/2016 02:18 AM, Ingo Molnar wrote:
> * Waiman Long wrote:
>
>> On a large system with many CPUs, using HPET as the clock source can
>> have a significant impact on the overall system performance because
>> of the following reasons:
>> 1) There is a single HPET counter shared by all the CPUs.
>> 2) HPET counter reading is a very slow operation.
>>
>> Using HPET as the default clock source may happen when, for example,
>> the TSC clock calibration exceeds the allowable tolerance. Something
>> the performance slowdown can be so severe that the system may crash
>> because of a NMI watchdog soft lockup, for example.
>> /*
>> + * Reading the HPET counter is a very slow operation. If a large number of
>> + * CPUs are trying to access the HPET counter simultaneously, it can cause
>> + * massive delay and slow down system performance dramatically. This may
>> + * happen when HPET is the default clock source instead of TSC. For a
>> + * really large system with hundreds of CPUs, the slowdown may be so
>> + * severe that it may actually crash the system because of a NMI watchdog
>> + * soft lockup, for example.
>> + *
>> + * If multiple CPUs are trying to access the HPET counter at the same time,
>> + * we don't actually need to read the counter multiple times. Instead, the
>> + * other CPUs can use the counter value read by the first CPU in the group.
> Hm, weird, so how can this:
>
> static cycle_t read_hpet(struct clocksource *cs)
> {
> 	return (cycle_t)hpet_readl(HPET_COUNTER);
> }
>
> ... cause an actual slowdown of that magnitude? This goes straight to MMIO. So is
> the hardware so terminally broken?

I only know that accessing the HPET counter is VERY slow. Andy said that
it takes at least a few us. I haven't done that measurement myself. I am
not sure what kind of contention happens when multiple CPUs access it at
the same time. It is not just the clock tick interrupt handler that needs
to access the time; many system calls will also cause the current time to
be read. When we have hundreds of CPUs in the system, it is not too hard
to cause a soft lockup if HPET is the default clock source.

> How good is the TSC clocksource on the affected system? Could we simply always use
> the TSC (and not use the HPET at all as a clocksource), instead of trying to fix
> broken hardware?
>
> Thanks,
>
> Ingo

The TSC clocksource, on the other hand, is per CPU, so there won't be
much contention in accessing it. Normally the TSC will be used as the
default clock source. However, if there is too much variation in the
actual clock speeds of the individual CPUs, the TSC calibration will fail
and the system will fall back to using HPET as the clock source.

During bootup, HPET will usually be selected as the default clock source
first. After a short time, the TSC will take over as the default clock
source. Problems can occur during that short transition period too. In
fact, we have 16-socket Broadwell-EX systems that hit this soft lockup
problem once in a few reboot cycles, which prompted me to find a solution
to fix it.

Cheers,
Longman
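
For readers following the thread, the idea in the quoted patch comment (other CPUs reuse the counter value read by the first CPU) can be sketched in userspace C11 roughly as below. This is only an illustrative sketch under stated assumptions, not the code from the patch: slow_counter_read(), read_lock, publish_seq, cached_value, try_read() and shared_counter_read() are all hypothetical names, and slow_counter_read() merely stands in for the hpet_readl(HPET_COUNTER) call quoted above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the slow hardware read; counts its calls. */
static uint32_t fake_counter;
static int slow_reads;

static uint32_t slow_counter_read(void)
{
	slow_reads++;
	return ++fake_counter;
}

/* Assumed shared state: a lock, a publish sequence and the last value. */
static atomic_flag read_lock = ATOMIC_FLAG_INIT;
static _Atomic uint32_t publish_seq;	/* bumped after each publish */
static _Atomic uint32_t cached_value;	/* value read by the lock holder */

/*
 * One attempt to obtain a counter value. start_seq is the value of
 * publish_seq sampled when the caller began waiting.
 * Returns true and fills *out on success, false if the caller must retry.
 */
static bool try_read(uint32_t start_seq, uint32_t *out)
{
	if (!atomic_flag_test_and_set(&read_lock)) {
		/* We got the lock: do the slow read and publish the result. */
		uint32_t v = slow_counter_read();

		atomic_store(&cached_value, v);
		atomic_fetch_add(&publish_seq, 1);
		atomic_flag_clear(&read_lock);
		*out = v;
		return true;
	}
	/* Someone else holds the lock: reuse a value published since we started. */
	if (atomic_load(&publish_seq) != start_seq) {
		*out = atomic_load(&cached_value);
		return true;
	}
	return false;
}

/* Spin until we either read the counter ourselves or can share a value. */
static uint32_t shared_counter_read(void)
{
	uint32_t start = atomic_load(&publish_seq);
	uint32_t v;

	while (!try_read(start, &v))
		;
	return v;
}
```

In this sketch a waiter only accepts a value published after it started waiting, so it never returns a value older than one it has already seen. With no contention, every call simply takes the lock and performs the slow read itself.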